Hadoop 3.1.4 Installation on Windows 10


2020, Nov 09    

Below is the procedure for installing Hadoop 3.1.4 on Windows 10. Once you have completed it, you can actively use Hadoop on your system.

Prepare:

  1. Hadoop 3.1.4, downloaded from a mirror
  2. Java JDK 1.8.0

NOTE: Hadoop 3.2.1 has some bugs that cause the NameNode and DataNode to shut down.

Set up

  1. Open cmd and check whether Java is already installed on your system by running “java -version”.

  2. If Java is not installed, install it first and select “C:\JAVA” as the location. If the installer then asks for another destination folder, keep the default and let the Java setup run.

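  3. Extract Hadoop-3.1.4.tar.gz or Hadoop-3.1.4.zip and place it under “C:\Hadoop-3.1.4” or, to match the paths used in the rest of this guide, rename the folder to “C:\Hadoop”. Once the environment variables in the following steps are set, you can check that Hadoop is installed correctly by running “hadoop version” in a new cmd window.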

  4. Set the HADOOP_HOME environment variable on Windows 10. Right-click This PC -> Properties -> Advanced system settings -> Environment Variables. Under User variables click New, name the variable HADOOP_HOME, and set its value to C:\Hadoop (the installation root, not its bin folder).

  5. Similarly, set the JAVA_HOME environment variable on Windows 10 and set its value to C:\Java\jdk1.8.0_271 (the JDK root, not its bin folder).

  6. Next, add the Hadoop bin directory (C:\Hadoop\bin) and the Java bin directory (C:\Java\jdk1.8.0_271\bin) to the Path variable.
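
The environment variables from the steps above are easy to get slightly wrong, so a small check can help. The following is a minimal Python sketch, assuming the conventional layout in which JAVA_HOME and HADOOP_HOME name the installation roots and only Path carries the bin folders. The function name check_hadoop_env is hypothetical and is not part of Hadoop.

```python
from pathlib import PureWindowsPath


def check_hadoop_env(env):
    """Return a list of problems found in an environment mapping (e.g. os.environ)."""
    problems = []
    # Normalise every Path entry so that slashes, case and trailing
    # separators do not affect the comparison below.
    raw_path = env.get("Path", env.get("PATH", ""))
    path_entries = {
        str(PureWindowsPath(entry.strip())).lower()
        for entry in raw_path.split(";")
        if entry.strip()
    }
    for var in ("JAVA_HOME", "HADOOP_HOME"):
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
            continue
        home = PureWindowsPath(value)
        if home.name.lower() == "bin":
            # Assumption: the variable should name the root folder.
            problems.append(f"{var} points at a bin folder instead of the install root")
            continue
        bin_dir = home / "bin"
        if str(bin_dir).lower() not in path_entries:
            problems.append(f"{bin_dir} is not on Path")
    return problems
```

Calling check_hadoop_env(os.environ) from a new cmd window (after importing os) returns an empty list when nothing looks wrong.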

Configuration

1. Edit the file C:/Hadoop/etc/hadoop/core-site.xml, paste in the XML below, and save the file.

<configuration>
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://localhost:9000</value>
   </property>
</configuration>

2. Similarly, edit the file C:/Hadoop/etc/hadoop/mapred-site.xml, paste in the XML below, and save the file.

<configuration>
   <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
   </property>
</configuration>

3. Create the folder “data” under “C:\Hadoop”

  • Create the folder “datanode” under “C:\Hadoop\data”
  • Create the folder “namenode” under “C:\Hadoop\data”
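
If you prefer to script the folder creation, the following is a minimal Python sketch. It assumes Hadoop lives in C:\Hadoop as in this guide; the helper name make_data_dirs is hypothetical.

```python
from pathlib import Path


def make_data_dirs(hadoop_home):
    """Create data/namenode and data/datanode under hadoop_home and return both paths."""
    created = []
    for name in ("namenode", "datanode"):
        target = Path(hadoop_home) / "data" / name
        # parents=True also creates the "data" folder itself;
        # exist_ok=True makes the call safe to repeat.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

For this guide's layout the call would be make_data_dirs(r"C:\Hadoop").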

4. Edit the file C:/Hadoop/etc/hadoop/hdfs-site.xml, paste in the XML below, and save the file.

<configuration>
   <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>file:///C:/Hadoop/data/namenode</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>file:///C:/Hadoop/data/datanode</value>
   </property>
</configuration>
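
Hadoop's documented defaults for dfs.namenode.name.dir and dfs.datanode.data.dir are file:// URIs, so writing the two folders as file URIs is one way to avoid ambiguity about how a Windows drive letter is parsed. The following Python sketch shows the conversion; the helper name to_file_uri is hypothetical.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def to_file_uri(windows_path):
    """Turn an absolute Windows path such as C:\\Hadoop\\data into a file:/// URI."""
    path = PureWindowsPath(windows_path)
    if not path.is_absolute():
        raise ValueError(f"not an absolute Windows path: {windows_path!r}")
    # as_posix() switches backslashes to forward slashes; quote() escapes
    # characters such as spaces while leaving "/" and the drive colon alone.
    return "file:///" + quote(path.as_posix(), safe="/:")
```

For example, to_file_uri(r"C:\Hadoop\data\namenode") gives a value that can be pasted into the value element of the matching property.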

5. Edit the file C:/Hadoop/etc/hadoop/yarn-site.xml, paste in the XML below, and save the file.

<configuration>
   <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
</configuration>

6. Edit the file C:/Hadoop/etc/hadoop/hadoop-env.cmd: replace the line “set JAVA_HOME=%JAVA_HOME%” with “set JAVA_HOME=C:\Java\jdk1.8.0_271”.
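
The four *-site.xml files edited in this section share one layout: a configuration element that holds property elements, each with a name and a value. To double-check what a file contains after pasting, a minimal Python sketch using the standard library's ElementTree could look like this. The function name read_site_properties is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse the text of a Hadoop *-site.xml file into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    if root.tag != "configuration":
        raise ValueError(f"expected <configuration> as root, got <{root.tag}>")
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        value = prop.findtext("value", default="").strip()
        if name:  # skip property elements that have no name
            properties[name] = value
    return properties
```

Passing in the contents of core-site.xml from step 1 should yield a dict whose only key is fs.defaultFS.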


Hadoop Configuration

  1. Download the file Hadoop Configuration.zip.
  2. Delete the bin folder at C:\Hadoop\bin and replace it with the bin folder from the archive you just downloaded (Hadoop Configuration.zip).
  3. Open cmd and run “hdfs namenode -format” to format the NameNode.

Testing

  1. Open cmd, change directory to “C:\Hadoop\sbin”, and type “start-all.cmd” to start the Hadoop daemons.
  2. Make sure these processes are running; use “jps” to list them:
    • Hadoop NameNode
    • Hadoop DataNode
    • YARN ResourceManager
    • YARN NodeManager
  3. Open http://localhost:8088 to view the YARN cluster page.
  4. Open http://localhost:9870 to view the HDFS NameNode page.
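
The jps check in step 2 can also be automated. The following Python sketch takes the text printed by jps, where each line holds a process id followed by a class name, and reports which of the four daemons are missing. The function name missing_daemons and the constant EXPECTED_DAEMONS are hypothetical.

```python
EXPECTED_DAEMONS = ("NameNode", "DataNode", "ResourceManager", "NodeManager")


def missing_daemons(jps_output):
    """Return the expected Hadoop daemons that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # A jps line looks like "<pid> <ClassName>"; ignore anything else.
        if len(parts) >= 2:
            running.add(parts[1])
    return [name for name in EXPECTED_DAEMONS if name not in running]
```

An empty list means all four daemons were found. The output of jps can be captured with subprocess.run(["jps"], capture_output=True, text=True).stdout and passed in directly.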

    Congratulations, Hadoop is installed.