CS491/591 PI/Word Count AWS
For this assignment you will run PI and Wordcount using Hadoop on AWS. You will then make some changes to create your own version of WordCount to run on AWS.
The assignment will require you to install Hadoop on your AMI, set up the Hadoop environment, and format the HDFS.
Execute pi (the jar file is already there) using different parameters. To run the pi program you will issue a command that looks like this:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 16 1100
The second-to-last number (16 here) is the number of mappers, and the last number (1100 here) is the number of random sample points each mapper counts.
You should run pi with different values for these and see how it affects the run time and accuracy - keep track of your results, see if you can find a trend. Do some research and find out what map and reduce are doing to compute pi.
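As a starting point for that research, the sketch below shows the general Monte Carlo idea in plain Java. It is not the code inside the Hadoop examples jar: the bundled pi example uses a quasi-random (Halton) point sequence, while this sketch assumes java.util.Random and a quarter circle, and it folds the final division into the reduce step for brevity. The names PiSketch, mapCountInside, reduceEstimate, and estimatePi are hypothetical.

```java
import java.util.Random;

final class PiSketch {
    // "Map" step (assumed form): one mapper draws `samples` random points in
    // the unit square and counts how many land inside the quarter circle.
    static long mapCountInside(long samples, long seed) {
        Random rng = new Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return inside;
    }

    // "Reduce" step (assumed form): add up the per-mapper counts, then
    // estimate pi as 4 * (points inside) / (total points).
    static double reduceEstimate(long[] insideCounts, long samplesPerMap) {
        long inside = 0;
        for (long c : insideCounts) {
            inside += c;
        }
        long total = samplesPerMap * insideCounts.length;
        return 4.0 * inside / total;
    }

    // Runs the "mappers" one after another; Hadoop would run them in parallel.
    static double estimatePi(int numMaps, long samplesPerMap) {
        long[] counts = new long[numMaps];
        for (int m = 0; m < numMaps; m++) {
            counts[m] = mapCountInside(samplesPerMap, m); // seed = mapper index
        }
        return reduceEstimate(counts, samplesPerMap);
    }

    public static void main(String[] args) {
        // Same parameters as the sample command: 16 mappers, 1100 points each.
        System.out.println(estimatePi(16, 1100));
    }
}
```

Calling estimatePi with larger values for either argument increases the total number of points, which is the quantity to watch when you look for a trend in accuracy.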
You will submit a report that describes what map is doing, describes what reduce is doing, presents your results, discusses the results, and indicates whether any of your results surprised you. Include graphs. Email the report to: email@example.com
Download the bible+shakes text and run wordcount on it. Again, the jar file is already available.
You will create your own WordCount.java file and edit the file so that it:
1. Counts the number of times the word "hope" or "Hope" occurs (count both lower and uppercase together)
2. Counts the number of words that start with the letter "v" or "V" (both lower case and uppercase counted together)
Example: if "very" occurs 2 times, "Very" occurs 1 time, and "variety" occurs 4 times, the count is 7.
3. Counts the number of words that start with the letter "a" or "A" (both lower case and uppercase counted together)
4. Counts the number of words that start with the letters "st" or "St" (both counted together).
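One possible way to organize the four counts is to map each token to zero or more counter keys and, inside your mapper, emit a 1 for each key. The sketch below shows only that classification logic in plain Java, with no Hadoop classes, so it can be tested locally before you move it into WordCount.java. The class name WordCategories, the methods categorize and countLine, and the key strings are all hypothetical. It also assumes whitespace tokenization with no punctuation stripping, so a token such as "hope," would not match "hope".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

final class WordCategories {
    // Returns the counter keys a single token contributes to.
    // The key names are illustrative; a token may match more than one rule.
    static List<String> categorize(String token) {
        List<String> keys = new ArrayList<>();
        if (token == null || token.isEmpty()) {
            return keys;
        }
        char first = token.charAt(0);
        if (token.equals("hope") || token.equals("Hope")) {
            keys.add("hope");
        }
        if (first == 'v' || first == 'V') {
            keys.add("starts_with_v");
        }
        if (first == 'a' || first == 'A') {
            keys.add("starts_with_a");
        }
        if (token.startsWith("st") || token.startsWith("St")) {
            keys.add("starts_with_st");
        }
        return keys;
    }

    // Stand-in for map + reduce on one line of text: split on whitespace
    // (an assumption), then sum a 1 for every key each token matches.
    static Map<String, Integer> countLine(String line) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            for (String key : categorize(itr.nextToken())) {
                counts.merge(key, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The example from item 2: 2 x "very", 1 x "Very", 4 x "variety".
        String line = "very very Very variety variety variety variety";
        System.out.println(countLine(line)); // prints {starts_with_v=7}
    }
}
```

In a Hadoop mapper, the inner loop would write each key with a count of 1 to the context instead of merging into a local map, and a summing reducer would produce the totals.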
Once you have your modified WordCount running, email the IP address of your VM to: firstname.lastname@example.org. He will notify you when he has graded your assignment so you can terminate your VM.
Email your discussion of PI to email@example.com
Directions for completing the assignment.