MapReduce and Java on Hadoop

  1. Write your code. Notice from the example here which Hadoop classes you need to import, and the basic syntax of Map and Reduce.
  2. Compile your code. Run javac -classpath /usr/local/hadoop/hadoop-0.16.1-core.jar Foo.java to compile it against the Hadoop core jar. (If Foo is a public class, Java requires the source file to be named Foo.java.)
  3. Jar the code. jar cvfe foogary.jar Foo Foo*.class will package up all of the appropriate Foo classes (in particular the map and reduce classes nested inside Foo) into foogary.jar and will also anoint Foo as the main class, so Hadoop will know where to start running things.
  4. Copy the jar into the distributed file system. hadoop dfs -copyFromLocal foogary.jar foogary.jar copies the jar file into the distributed file system, which will allow the non-local nodes to find it, love it, and use the classes in it.
  5. Run the job. hadoop jar foogary.jar inputArguments outputFileDirectory will run the code in Hadoop.
  6. Copy the output back. hadoop dfs -copyToLocal outputFileDirectory outputFileDirectory will copy the output back to your local system.
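
To make the map and reduce roles from step 1 concrete, here is a minimal plain-Java sketch of a word count. It deliberately does not use the Hadoop API, so it is not a job you can submit with hadoop jar; it only imitates, in a single process, the flow a MapReduce job follows: map each input line to (key, value) pairs, group the values by key, then reduce each group to one result. The names Foo, WordMapper, SumReducer and run are illustrative assumptions, not taken from the example linked above.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of map -> group by key -> reduce.
// This does NOT use Hadoop classes; all names here are hypothetical.
class Foo {

    // Map step: turn one line of input into (word, 1) pairs.
    static class WordMapper {
        List<Map.Entry<String, Integer>> map(String line) {
            List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
                }
            }
            return pairs;
        }
    }

    // Reduce step: collapse every value seen for one key into a single total.
    static class SumReducer {
        int reduce(String key, List<Integer> values) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        }
    }

    // Stand-in for the framework: runs the mapper on every line,
    // groups the emitted values by key, then runs the reducer per key.
    static Map<String, Integer> run(List<String> lines) {
        WordMapper mapper = new WordMapper();
        SumReducer reducer = new SumReducer();

        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : mapper.map(line)) {
                List<Integer> values = grouped.get(pair.getKey());
                if (values == null) {
                    values = new ArrayList<Integer>();
                    grouped.put(pair.getKey(), values);
                }
                values.add(pair.getValue());
            }
        }

        Map<String, Integer> totals = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            totals.put(entry.getKey(), reducer.reduce(entry.getKey(), entry.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the quick brown fox", "the lazy dog", "the fox");
        // Prints each distinct word with its count, sorted by word.
        System.out.println(run(lines));
    }
}
```

Compiling this sketch produces Foo.class plus one class file per nested class (Foo$WordMapper.class and Foo$SumReducer.class), which is why the Foo*.class pattern in step 3 picks up the map and reduce classes along with the main class. In a real Hadoop job, the grouping done by run here is handled by the framework across the cluster.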

Gary Lewandowski
Last modified: Tue Apr 8 12:46:47 EDT 2008