Sunday, November 23, 2014

Hadoop Tips: Change default namenode and datanode directories

When you start Hadoop in pseudo-distributed mode with the sbin/start-all.sh command for the first time, the namenode and datanode directories are created under /tmp by default. The problem is that many systems clear /tmp on reboot, so after you restart your machine those directories are gone and you can't start Hadoop again.
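
To see whether your installation is affected, you can look for the default location. This is only a sketch: it assumes hadoop.tmp.dir has not been overridden, in which case HDFS data lives under /tmp/hadoop-<your user name>/dfs. The variable name default_dir is just for illustration.

```shell
# Default HDFS location, assuming hadoop.tmp.dir was left at its default value.
default_dir="/tmp/hadoop-$(id -un)/dfs"

if [ -d "$default_dir" ]; then
    # Lists the subdirectories Hadoop created for the namenode and datanode.
    ls "$default_dir"
else
    echo "no HDFS data found under $default_dir"
fi
```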

To solve this problem, change the default directories in the configuration file etc/hadoop/hdfs-site.xml inside your Hadoop directory. Add these properties inside the <configuration> element:

    <!--for namenode-->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///path/to/your/namenode</value>
    </property>

    <!--for datanode-->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///path/to/your/datanode/</value>
    </property>
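
The directories named in the properties above can be created from the shell. A minimal sketch, assuming a hypothetical base path in your home directory (HDFS_BASE is not a Hadoop variable, just a name used here); any location outside /tmp works:

```shell
# HDFS_BASE is a hypothetical location; pick any path outside /tmp.
HDFS_BASE="${HDFS_BASE:-$HOME/hdfs-data}"

# -p creates missing parent directories and does nothing if they already exist.
mkdir -p "$HDFS_BASE/namenode" "$HDFS_BASE/datanode"

# Print the values to paste into hdfs-site.xml.
echo "dfs.namenode.name.dir = file://$HDFS_BASE/namenode"
echo "dfs.datanode.data.dir = file://$HDFS_BASE/datanode"
```

Because HDFS_BASE is an absolute path starting with /, the printed values have the file:/// form used in the example.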

Make sure your namenode and datanode directories exist, and don't forget to write the property values in URI format (starting with file://), just like in the example. After that, format your namenode. Note that formatting erases any existing HDFS metadata. Use this command:

$ bin/hdfs namenode -format

and start Hadoop again using:

$ sbin/start-all.sh

If the namenode or datanode is still not working, check the log files (by default in the logs/ directory inside your Hadoop directory) to see the problem.
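
One way to look at those logs quickly is a small helper like the one below. It is only a sketch: show_daemon_logs is a made-up function name, and it assumes the log files contain "namenode" or "datanode" in their names and end in .log, which may differ on your setup.

```shell
# Hypothetical helper: print the last 20 lines of each namenode/datanode log.
# The log directory is the first argument, else $HADOOP_LOG_DIR, else $HADOOP_HOME/logs.
show_daemon_logs() {
    log_dir="${1:-${HADOOP_LOG_DIR:-$HADOOP_HOME/logs}}"
    # Note: the *namenode* pattern also matches secondarynamenode logs.
    for f in "$log_dir"/*namenode*.log "$log_dir"/*datanode*.log; do
        [ -f "$f" ] || continue    # skip patterns that matched nothing
        echo "== $f"
        tail -n 20 "$f"
    done
}
```

Call it as show_daemon_logs /path/to/hadoop/logs and look for lines marked ERROR or FATAL near the end of each file.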
  
Hope these tips help. If you run into other problems related to this, please leave a comment below. Cheers! :)
