Hadoop Demo
Presented by: Imranul Hoque
Topics: Hadoop running modes (stand-alone, pseudo-distributed, cluster); running MapReduce jobs; status/logs.


1 Hadoop Demo
Presented by: Imranul Hoque

2 Topics
Hadoop running modes
– Stand-alone
– Pseudo-distributed
– Cluster
Running MapReduce jobs
Status/logs
Sample MapReduce code

3 Required Software
Hadoop (release 0.18.3)
– http://apache.osuosl.org/hadoop/core/hadoop-0.18.3/hadoop-0.18.3.tar.gz
Java Development Kit (jdk 1.6.0_01)
– http://java.sun.com/javase/downloads/index.jsp
Ant (ant 1.7.1)
– http://apache.inetbridge.net/ant/binaries/apache-ant-1.7.1-bin.tar.gz

4 Setup
NameNode: sherpa01
JobTracker: sherpa02
DataNode/TaskTracker: sherpa05, sherpa06

5 Assumptions
ssh must be installed and sshd must be running
Shared home directory (NFS) across all nodes in the cluster (makes life easier)

6 Steps
Install JDK, ant
Passphraseless ssh
Compiling Hadoop
Setting up config parameters
Starting up Hadoop
Running jobs
Job status

7 Passphraseless ssh
Hosts: Source, Destination
1. Generate a private-public key pair on Source: ~/.ssh/id_dsa and ~/.ssh/id_dsa.pub
2. Send the public key to Destination
3. Add the public key to the authorized key list on Destination: ~/.ssh/authorized_keys

8 Passphraseless ssh (2)
With an NFS-shared home directory:
1. ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
2. cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys (four times, once per node)
3. Modify the hostname in each authorized_keys entry: sherpa01, sherpa02, sherpa05, sherpa06
Add “StrictHostKeyChecking no” to /etc/ssh/ssh_config to turn off the host-key prompt
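The append step can be made safe to repeat. What follows is a minimal sketch, not part of the original slides: `authorize_key` is a hypothetical helper that adds a public key to an authorized_keys file only when that exact line is not already there, and sets the restrictive permissions that sshd usually expects. The second argument (the ssh directory) is an assumption added so the helper can be tried against a scratch directory first.

```shell
# Hypothetical helper (not from the slides): append a public key to
# authorized_keys only if that exact line is not already present.
authorize_key() {
  pubkey_file=$1                 # e.g. ~/.ssh/id_dsa.pub
  ssh_dir=${2:-$HOME/.ssh}       # directory that holds authorized_keys
  auth_file=$ssh_dir/authorized_keys

  # sshd commonly refuses keys when these are group- or world-writable.
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$auth_file" && chmod 600 "$auth_file"

  # -F fixed string, -x whole-line match, -q quiet, -f read pattern from file
  if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
}

# Example call against the real key from the slide:
#   authorize_key ~/.ssh/id_dsa.pub
```

Calling the helper twice with the same key leaves a single entry, which avoids the duplicate lines that a plain `cat >>` would produce.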

9 Setting the PATH
JAVA_HOME=/usr/java/jdk1.6.0_01
ANT_HOME=~/ant
PATH=/usr/java/jdk1.6.0_01/bin:$PATH
PATH=~/ant/bin:$PATH

10 Installing and Configuring Hadoop
Extract
Build (ant)
Modify conf/hadoop-env.sh:
– export JAVA_HOME=/usr/java/jdk1.6.0_01
Inform Hadoop of the Masters and Slaves
– conf/masters
– conf/slaves
Modify conf/hadoop-site.xml
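The slide does not show the contents of these files, so the following is only a sketch under stated assumptions. It writes conf/masters, conf/slaves and a two-property hadoop-site.xml into a scratch directory for inspection. The host names come from the Setup slide; the port numbers 9000 and 9001 and the choice of sherpa01 for conf/masters are assumptions, and `CONF_DIR` is a hypothetical variable. `fs.default.name` and `mapred.job.tracker` are the standard property names in this Hadoop generation.

```shell
# Sketch: write the config files into a scratch directory so they can be
# reviewed before being copied into the real conf/ directory.
CONF_DIR=${CONF_DIR:-$(mktemp -d)}

# Host names are from the Setup slide; the ports are assumptions.
NAMENODE_HOST=sherpa01
JOBTRACKER_HOST=sherpa02
NAMENODE_PORT=9000      # assumed
JOBTRACKER_PORT=9001    # assumed

# conf/masters: assumed here to name the NameNode host.
printf '%s\n' "$NAMENODE_HOST" > "$CONF_DIR/masters"

# conf/slaves: one DataNode/TaskTracker host per line.
printf '%s\n' sherpa05 sherpa06 > "$CONF_DIR/slaves"

# hadoop-site.xml: where HDFS and the JobTracker can be reached.
cat > "$CONF_DIR/hadoop-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$NAMENODE_HOST:$NAMENODE_PORT</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>$JOBTRACKER_HOST:$JOBTRACKER_PORT</value>
  </property>
</configuration>
EOF

echo "Wrote config files to $CONF_DIR"
```

Because the home directory is shared over NFS, one copy of these files under the Hadoop directory would be visible to every node.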

11 Rack Awareness
Set topology.script.file.name to conf/fakedns.sh
In fakedns.sh:
– echo /rack_id
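Hadoop runs the topology script with one or more host names or IP addresses as arguments and reads the rack paths from its standard output. A minimal sketch of what fakedns.sh could contain is shown here as a shell function; the function name `rack_for_hosts` and the rack path `/rack1` are hypothetical, and the single-rack layout is an assumption matching the one-line script on the slide.

```shell
# Sketch of a topology script body, written as a function so it can be
# tried in an interactive shell. It prints one rack path per argument.
rack_for_hosts() {
  for host in "$@"; do
    # Every node is placed in the same (hypothetical) rack.
    echo "/rack1"
  done
}

# Two hosts in, two rack paths out.
rack_for_hosts sherpa05 sherpa06
```

A real multi-rack cluster would replace the fixed `echo` with a lookup from host to rack.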

12 Starting Hadoop
Format the NameNode file system (sherpa01):
– bin/hadoop namenode -format
From NameNode (sherpa01):
– bin/start-dfs.sh
From JobTracker (sherpa02):
– bin/start-mapred.sh

13 Running MapReduce
Copy data to HDFS
– bin/hadoop dfs -copyFromLocal ~/data gutenberg
Run MapReduce
– bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -r 6 gutenberg gutenberg-output
Some HDFS commands
– copyToLocal, cat, cp, rm, du, ls, etc.
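The commands on this slide can be chained so that a failed copy stops the job from starting. The sketch below is an illustration only: `hadoop_cmd` and `run_wordcount` are hypothetical helper names, and `HADOOP_HOME` is an assumed variable pointing at the extracted Hadoop directory. When no executable bin/hadoop is found, the helper prints each command instead of running it, which allows a dry run on a machine without Hadoop.

```shell
# Run bin/hadoop if it exists under HADOOP_HOME (assumed variable);
# otherwise print the command so the sequence can be reviewed first.
hadoop_cmd() {
  if [ -x "${HADOOP_HOME:-.}/bin/hadoop" ]; then
    "${HADOOP_HOME:-.}/bin/hadoop" "$@"
  else
    echo "bin/hadoop $*"
  fi
}

# Copy input to HDFS, run the wordcount example with 6 reducers,
# then list the output directory. Each step runs only if the previous
# one succeeded.
run_wordcount() {
  input_local=$1     # local directory with the input text
  hdfs_in=$2         # HDFS input directory
  hdfs_out=$3        # HDFS output directory (the job fails if it exists)

  hadoop_cmd dfs -copyFromLocal "$input_local" "$hdfs_in" &&
  hadoop_cmd jar hadoop-0.18.3-examples.jar wordcount -r 6 "$hdfs_in" "$hdfs_out" &&
  hadoop_cmd dfs -ls "$hdfs_out"
}

# Example call with the names used on the slide:
#   run_wordcount ~/data gutenberg gutenberg-output
```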

14 Job/Node Status
NameNode:
– http://sherpa01.cs.uiuc.edu:50001
JobTracker:
– http://sherpa02.cs.uiuc.edu:50002
Also look at the logs:
– logs/

15 WordCount.java
src/examples/org/apache/hadoop/examples/WordCount.java
– Map function
– Reduce function
– Driver function
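The three parts of WordCount.java can be pictured with a local shell analogy. This is not the Java code from the Hadoop source tree, only a sketch of the same data flow on one machine: a map step that emits a (word, 1) pair per token, a sort that stands in for the shuffle, and a reduce step that sums the counts per word. All function names are hypothetical.

```shell
# Map: emit one "word<TAB>1" pair per whitespace-separated token.
map_words() {
  tr -s '[:space:]' '\n' | awk 'NF { print $0 "\t1" }'
}

# Shuffle: bring identical keys next to each other.
shuffle() {
  LC_ALL=C sort
}

# Reduce: sum the values of each run of identical keys.
reduce_counts() {
  awk -F '\t' '
    NR > 1 && $1 != key { print key "\t" sum; sum = 0 }
    { key = $1; sum += $2 }
    END { if (NR > 0) print key "\t" sum }
  '
}

# Driver: wire the stages together.
wordcount_local() {
  map_words | shuffle | reduce_counts
}

printf 'the cat the hat\n' | wordcount_local
```

In Hadoop the map and reduce steps run in parallel on the TaskTracker nodes and the framework performs the sort between them; the pipeline above only mirrors the order of the stages.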

16 Shutdown
From NameNode (sherpa01):
– bin/stop-dfs.sh
From JobTracker (sherpa02):
– bin/stop-mapred.sh

17 Conclusion
For more details:
– http://hadoop.apache.org/core/
– http://wiki.apache.org/hadoop/

