Connect to Pig
The Bitnami Hadoop Stack includes Pig, a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs.
To use Pig, start its interactive shell by running the pig command:
$ pig
After a few moments, you will see the grunt prompt:
grunt>
To run the Pig tutorial scripts, first upload the tutorial data file to HDFS (the trailing dot copies it to your HDFS home directory):
$ hadoop fs -copyFromLocal installdir/hadoop/pig/tutorial/data/excite.log.bz2 .
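Before running a script, it can help to confirm that the upload succeeded. The following is a minimal sketch, assuming the file was copied to your HDFS home directory as in the command above. The check_upload function is a hypothetical helper name introduced here for illustration; it is not part of Hadoop or the Bitnami stack.

```shell
# Hypothetical helper: report whether a path exists in HDFS and list it if so.
# The path is resolved relative to your HDFS home directory.
check_upload() {
  if hadoop fs -test -e "$1"; then
    # -test -e returned 0: the path exists, so show its size and timestamp.
    hadoop fs -ls "$1"
  else
    echo "$1 not found in HDFS" >&2
    return 1
  fi
}

# Only call the helper when the hadoop command is available on this machine.
if command -v hadoop >/dev/null 2>&1; then
  check_upload excite.log.bz2
fi
```

If the helper reports that the file is missing, repeat the copyFromLocal step before continuing.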
This example uses script1-hadoop.pig, which you can run as follows:
$ cd installdir/hadoop/pig/tutorial
$ pig ./scripts/script1-hadoop.pig
The process takes a few minutes. Once it finishes, you will see output similar to the following, indicating success:
Input(s):
Successfully read 944954 records (10409092 bytes) from: "hdfs://localhost:8020/user/hadoop/excite.log.bz2"
Output(s):
Successfully stored 13530 records (659954 bytes) in: "hdfs://localhost:8020/user/hadoop/script1-hadoop-results"
(...)
2018-02-20 09:11:36,947 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2018-02-20 09:11:36,976 [main] INFO org.apache.pig.Main - Pig script completed in 3 minutes, 21 seconds and 433 milliseconds (201433 ms)
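The results are stored in HDFS in the script1-hadoop-results directory named in the output above. As a rough sketch of how you might inspect them, the snippet below lists that directory and prints the first few result lines. The show_results function is a hypothetical helper, and the part-* file name pattern is an assumption based on the usual naming of MapReduce output files; adjust it if your listing shows different names.

```shell
# Hypothetical helper: list a Pig output directory in HDFS and preview its contents.
show_results() {
  dir="${1:-script1-hadoop-results}"
  # Show the files that the job wrote; stop if the directory cannot be listed.
  hadoop fs -ls "$dir" || return 1
  # Print the first 10 lines of the output files.
  # The glob is quoted so that the HDFS shell, not the local shell, expands it.
  hadoop fs -cat "$dir/part-*" | head -n 10
}

# Only call the helper when the hadoop command is available on this machine.
if command -v hadoop >/dev/null 2>&1; then
  show_results script1-hadoop-results
fi
```

Note that Pig typically refuses to store results in an output directory that already exists, so you may need to remove the directory (for example with hadoop fs -rm -r script1-hadoop-results) before running the same script again.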