
Big Data and Hadoop

HADOOP 2.6.0 INSTALLATION GUIDE

Step 1: Download VMware from this link:


https://my.vmware.com/web/vmware/downloads

Step 2: Download the Ubuntu 14.04 image:


http://www.traffictool.net/vmware/ubuntu1404t.html

www.skillpeed.com Page 1
Step 3: Extract the downloaded file to a location on your machine

Step 4: Open VMware and click "Open a Virtual Machine"

Step 5: Now select the Ubuntu image you extracted

Your VMware window should now look like the image given below. Click "Edit virtual machine settings"

Step 6: Increase the virtual machine's memory (6 GB minimum), then click OK.

Now click "Play virtual machine"

You'll see a console like this:

Step 7: Click on Terminal

Step 8: To install Java, first add the repository as shown in the screenshot below:

Step 9: It will ask for your password; enter it

Then press Enter again as shown below:

Step 10: Now update your system

This will take a few minutes. After completion, the screen will look like the one below:

Step 11: Now invoke the Java-7-installer as shown below:

It will ask you to accept the license. Select OK as shown

Step 12: Click on Yes

Step 13: Java installation will continue for some time. Once it is complete, the screen
would look like the following:

Step 14: Confirm Java version as shown below:

Step 15: Now install the OpenSSH server

Step 16: Generate SSH keys

Step 17: Just press Enter when asked for a passphrase; do not type one yourself. Once done, it
will look like the following:

Step 18: Enable SSH access

Step 19: Test SSH access
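The screenshots for Steps 15 to 19 are not reproduced here. As a rough sketch only, a passwordless SSH setup of this kind is commonly done as below. The function name setup_passwordless_ssh and its optional directory argument are hypothetical helpers, not part of the original guide. Installing the server and the final login test are left as comments because they need root access and a running SSH daemon.

```shell
#!/usr/bin/env bash
# Sketch of Steps 15-19 (assumed commands; the original screenshots are missing).
# Step 15 needs root, so it is shown as a comment only:
#   sudo apt-get install openssh-server

# Hypothetical helper: create an RSA key with an empty passphrase (Steps 16-17)
# and authorize it for logins to this machine (Step 18).
setup_passwordless_ssh() {
  local key_dir="${1:-$HOME/.ssh}"
  if ! command -v ssh-keygen >/dev/null 2>&1; then
    echo "ssh-keygen not found; install OpenSSH first" >&2
    return 1
  fi
  mkdir -p "$key_dir" && chmod 700 "$key_dir"
  # Generate the key only if none exists, so an existing key is never overwritten.
  if [ ! -f "$key_dir/id_rsa" ]; then
    ssh-keygen -q -t rsa -P "" -f "$key_dir/id_rsa"
  fi
  # Append the public key unless it is already authorized.
  touch "$key_dir/authorized_keys"
  if ! grep -qxF "$(cat "$key_dir/id_rsa.pub")" "$key_dir/authorized_keys"; then
    cat "$key_dir/id_rsa.pub" >> "$key_dir/authorized_keys"
  fi
  chmod 600 "$key_dir/authorized_keys"
}

# Step 19, run by hand once the key is in place:
#   ssh localhost
```

Calling setup_passwordless_ssh with no argument works on ~/.ssh. Running it twice is safe because the key and the authorized_keys entry are only added when missing.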

Step 20: Now disable the IPv6 as shown below:

Open the sysctl.conf file as follows:

Step 21: Add the following lines, as shown in the screenshot:

# IPv6 disabled

net.ipv6.conf.all.disable_ipv6 = 1

net.ipv6.conf.default.disable_ipv6 = 1

net.ipv6.conf.lo.disable_ipv6 = 1

Step 22: Next, reboot your virtual machine

Step 23: Once you restart the machine test if IPv6 is disabled, your output should be as
shown in the screenshot below:
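The screenshot for this check is missing. One common way to test it, sketched here under that assumption, is to read the kernel flag file /proc/sys/net/ipv6/conf/all/disable_ipv6, which contains 1 once the sysctl settings from Step 21 are active. The helper name ipv6_disabled and its optional file argument are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the kernel flag file contains 1 (IPv6 disabled).
# The optional argument lets the check point at another file, for example in a test.
ipv6_disabled() {
  local flag_file="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
  [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

if ipv6_disabled; then
  echo "IPv6 is disabled"
else
  echo "IPv6 is still enabled (or the flag file is missing)"
fi
```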

Step 24: Now we are ready for Hadoop Installation

Download Hadoop-2.6.0 from this link:

http://supergsego.com/apache/hadoop/common/hadoop-2.6.0/

Step 25: Now extract the tar file as shown below
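The extraction command itself was only shown in a screenshot. A minimal sketch, assuming the download is a gzip-compressed tarball named hadoop-2.6.0.tar.gz, could look like this. The helper name extract_hadoop and the default destination are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a Hadoop tarball into a target directory.
# Usage: extract_hadoop <archive.tar.gz> [destination]   (destination defaults to $HOME)
extract_hadoop() {
  local archive="$1"
  local dest="${2:-$HOME}"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  mkdir -p "$dest"
  # -x extract, -z gunzip, -f read from the named file, -C change into dest first.
  tar -xzf "$archive" -C "$dest"
}
```

For example, extract_hadoop ~/Downloads/hadoop-2.6.0.tar.gz (the Downloads location is an assumption) would create ~/hadoop-2.6.0, which matches the HADOOP_HOME path used later in this guide when the home directory is /home/user.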

Step 26: You will see the Hadoop folder as follows

Step 27: Open your .bashrc file as shown below:

Step 28: Add the highlighted lines as shown below:

# -- HADOOP ENVIRONMENT VARIABLES START -- #

export JAVA_HOME=/usr/lib/jvm/java-7-oracle

export HADOOP_HOME=/home/user/hadoop-2.6.0

export PATH=$PATH:$HADOOP_HOME/bin

export PATH=$PATH:$HADOOP_HOME/sbin

export HADOOP_MAPRED_HOME=$HADOOP_HOME

export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME

export YARN_HOME=$HADOOP_HOME

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

# -- HADOOP ENVIRONMENT VARIABLES END -- #

Step 29: Close the terminal and open a new one, then check the Java and Hadoop
home variables with these commands:

echo $JAVA_HOME

echo $HADOOP_HOME
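To check every variable from the .bashrc block in one pass rather than one echo at a time, a small loop can help. This is only a sketch; the function name check_hadoop_env is hypothetical, and the variable list is copied from the export lines in Step 28.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each variable from the .bashrc block, or flag it as unset.
check_hadoop_env() {
  local name
  for name in JAVA_HOME HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
              HADOOP_HDFS_HOME YARN_HOME HADOOP_COMMON_LIB_NATIVE_DIR; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -n "${!name:-}" ]; then
      echo "$name=${!name}"
    else
      echo "$name is not set"
    fi
  done
}

check_hadoop_env
```

Any line reading "is not set" suggests the new terminal did not pick up the .bashrc changes.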

Step 30: Now we will configure Hadoop. Go to the etc/hadoop directory inside the Hadoop folder

(Command: cd ~/hadoop-2.6.0/etc/hadoop/). You should be able to see all the config
files as shown below:

Step 31: Open hadoop-env.sh and add the highlighted line as shown below:

export JAVA_HOME=/usr/lib/jvm/java-7-oracle

Step 32: Open core-site.xml

Step 33: Edit core-site.xml as follows and add the highlighted lines as shown below:

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

</configuration>

Step 34: Open yarn-site.xml

Step 35: Edit yarn-site.xml as follows and add the highlighted lines as shown below:

<configuration>

<!-- Site specific YARN configuration properties -->

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

</configuration>

Before opening mapred-site.xml, create it from the template that ships with Hadoop

(Command: sudo cp mapred-site.xml.template mapred-site.xml)

Step 36: Open mapred-site.xml

Step 37: Add the highlighted lines as shown below:

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

</configuration>

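Step 38: For HDFS, create two directories for Hadoop storage. Use absolute paths so that they match the locations set in hdfs-site.xml, whatever directory you are currently in

Commands: sudo mkdir -p /home/user/hadoop_tmp/hdfs/namenode

sudo mkdir -p /home/user/hadoop_tmp/hdfs/datanode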
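A minimal sketch of this step, assuming the storage directories should end up under the home directory that hdfs-site.xml points to (/home/user in this guide). The HADOOP_TMP variable is introduced here for illustration only and is not read by Hadoop. Because the directories sit inside the user's own home directory, this variant creates them without sudo, so they are owned by the current user from the start.

```shell
#!/usr/bin/env bash
# Assumption: storage lives under the current user's home directory, which is
# /home/user in this guide. Override HADOOP_TMP to use a different location.
HADOOP_TMP="${HADOOP_TMP:-$HOME/hadoop_tmp}"

# No sudo is needed inside your own home directory, so the directories are
# created already owned by the current user.
mkdir -p "$HADOOP_TMP/hdfs/namenode" "$HADOOP_TMP/hdfs/datanode"

# List what was created.
ls -d "$HADOOP_TMP"/hdfs/*
```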

Step 39: Open hdfs-site.xml

Step 40: Add the highlighted lines as shown below:

<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/home/user/hadoop_tmp/hdfs/namenode</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/home/user/hadoop_tmp/hdfs/datanode</value>

</property>

</configuration>
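After editing the four site files, it can be useful to confirm that a property really has the value you typed. The following is a naive sketch, not an XML parser: it assumes each name and value element fits on a single line, as in the listings in this guide. The function name conf_value is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the <value> that follows a given <name> in a
# Hadoop *-site.xml file. This is a naive line-based scan, not an XML parser;
# it assumes <name> and <value> each fit on one line, as in this guide.
conf_value() {
  local file="$1" key="$2"
  awk -v key="$key" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$file"
}
```

For example, conf_value ~/hadoop-2.6.0/etc/hadoop/hdfs-site.xml dfs.replication should print 1 if the file matches Step 40. An empty result means the property was not found.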

Change the ownership of the folders with these commands:

sudo chown -R user:user /home/user/hadoop-2.6.0/etc/hadoop

sudo chown -R user:user /home/user/hadoop_tmp

Step 41: Format the namenode (Command: hdfs namenode -format)

Step 42: Start all the services of Hadoop as shown below:
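The start command appeared only in a screenshot. Hadoop 2.x ships the scripts start-dfs.sh and start-yarn.sh in its sbin directory, so one plausible way to start everything is sketched below. The wrapper name start_hadoop is hypothetical, and the sketch assumes the PATH entries from the .bashrc step are in effect.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: start HDFS, then YARN, using the scripts shipped in
# $HADOOP_HOME/sbin (on PATH after the .bashrc changes).
start_hadoop() {
  local script
  for script in start-dfs.sh start-yarn.sh; do
    if ! command -v "$script" >/dev/null 2>&1; then
      echo "$script not found on PATH; check the .bashrc entries" >&2
      return 1
    fi
  done
  # start-dfs.sh launches NameNode, DataNode and SecondaryNameNode;
  # start-yarn.sh launches ResourceManager and NodeManager.
  start-dfs.sh && start-yarn.sh
}
```

Run start_hadoop from a new terminal. If it reports a missing script, recheck the HADOOP_HOME and PATH lines in .bashrc.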

Step 43: You can confirm the services by running jps command as shown below:
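The jps screenshot is missing, so here is a sketch of what to look for. On a single-node setup like this one, five daemons are normally expected: NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The helper missing_daemons is hypothetical; it reads jps output and prints any of the five that are absent.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read jps-style lines ("<pid> <name>") from stdin and
# print the name of every expected Hadoop daemon that does not appear.
missing_daemons() {
  local running daemon
  running=$(awk '{print $2}')
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -x matches whole lines only, so SecondaryNameNode does not count as NameNode.
    if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
      echo "$daemon"
    fi
  done
}

# Only call jps when it is available (it ships with the JDK).
if command -v jps >/dev/null 2>&1; then
  jps | missing_daemons
fi
```

No output means all five daemons are running. Any name printed points to a service that failed to start, and its log file under the Hadoop logs directory is the first place to look.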

Step 44: Check the root directory of HDFS

Command: hadoop fs -ls /

If nothing is listed, create a directory:

Command: hadoop fs -mkdir /user

Alternatively, you can browse the directories in a web browser.

Link: localhost:50070

Go to Utilities and click "Browse the file system".

All the HDFS files on your system are listed here.

Link: http://localhost:50070/dfshealth.jsp

Eclipse download: sudo wget http://eclipse.stu.edu.tw/technology/epp/downloads/release/mars/1/eclipse-jee-mars-1-linux-gtk.tar.gz
