Showing posts with label namenode. Show all posts

Sunday, November 23, 2014

Hadoop Tips: Change default namenode and datanode directories

When we start Hadoop in pseudo-distributed mode using the sbin/start-all.sh command for the first time, the default namenode and datanode directories are created under /tmp/. The problem is that if you restart your machine, those directories are deleted and you can't start Hadoop again.
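To check whether your installation is still using the defaults, you can look under /tmp. This is a minimal sketch, assuming the stock defaults where hadoop.tmp.dir is /tmp/hadoop-<user> and the namenode and datanode directories are dfs/name and dfs/data below it. The helper name default_hdfs_dirs is made up for this example:

```shell
# Hypothetical helper: print the stock default HDFS directories for a user.
# Assumes hadoop.tmp.dir has not been overridden in core-site.xml.
default_hdfs_dirs() {
    user="$1"
    echo "/tmp/hadoop-$user/dfs/name"   # default dfs.namenode.name.dir
    echo "/tmp/hadoop-$user/dfs/data"   # default dfs.datanode.data.dir
}

# Report which of the default directories currently exist on this machine.
for dir in $(default_hdfs_dirs "$(id -un)"); do
    if [ -d "$dir" ]; then
        echo "exists:  $dir"
    else
        echo "missing: $dir"
    fi
done
```

If both directories show up as "missing" right after a reboot, you are most likely hitting the problem described above.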

To solve this problem, you can change the default directories in the configuration file etc/hadoop/hdfs-site.xml inside your Hadoop installation. Add these properties inside the <configuration> element:

    <!--for namenode-->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///path/to/your/namenode</value>
    </property>

    <!--for datanode-->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///path/to/your/datanode/</value>
    </property>

Please make sure that your namenode and datanode directories exist. Also, don't forget to write the property values in the correct URI format (starting with file://), just like in the example. After that, you can format your namenode, which creates a new, empty file system in the namenode directory, using this command:

$ bin/hdfs namenode -format

and start Hadoop again using:

$ sbin/start-all.sh
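The directories named in hdfs-site.xml have to exist before you format the namenode. Here is a minimal sketch of creating them and printing the matching file:// URIs. The base directory $HOME/hadoop-data and the helper to_file_uri are assumptions made for this example, so use whatever location suits your machine:

```shell
# Hypothetical base directory; replace it with a location that survives reboots.
HDFS_BASE="${HDFS_BASE:-$HOME/hadoop-data}"

# Create the namenode and datanode directories if they do not exist yet.
mkdir -p "$HDFS_BASE/namenode" "$HDFS_BASE/datanode"

# Turn an absolute path into the file:// URI form used in hdfs-site.xml.
to_file_uri() {
    echo "file://$1"
}

# Print the values to paste into the two <value> elements.
echo "dfs.namenode.name.dir = $(to_file_uri "$HDFS_BASE/namenode")"
echo "dfs.datanode.data.dir = $(to_file_uri "$HDFS_BASE/datanode")"
```

Because the path is absolute and starts with a slash, the resulting URI has three slashes after "file:", as in the XML example.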

If the namenode or datanode is still not working, you can check the log files to see what went wrong.
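One way to scan the logs from a terminal is sketched below. It assumes the daemons write to the logs directory under your Hadoop installation (the default unless HADOOP_LOG_DIR is set) and that HADOOP_HOME points at that installation. The function name recent_errors is made up for this example:

```shell
# Hypothetical helper: print the last few ERROR/FATAL lines from every .log
# file in the given directory. Prints nothing if there are no matches.
recent_errors() {
    log_dir="$1"
    for log in "$log_dir"/*.log; do
        [ -f "$log" ] || continue          # skip when the glob matched nothing
        matches=$(grep -E 'ERROR|FATAL' "$log" | tail -n 5)
        if [ -n "$matches" ]; then
            echo "== $log"
            echo "$matches"
        fi
    done
}

# Assumption: HADOOP_HOME is set; fall back to the current directory otherwise.
recent_errors "${HADOOP_HOME:-.}/logs"
```

A wrong directory path or a missing file:// prefix usually shows up in the namenode or datanode log as one of these ERROR or FATAL lines.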
  
I hope these tips help you. If you run into other problems related to this, please leave a comment below. Cheers! :)

Tuesday, November 4, 2014

Hadoop Tips: Useful URLs in a Hadoop system

For the past several weeks I have been installing and playing with Hadoop, and there is still a lot I need to learn about it. So I am writing this post so that I don't forget what I have learned so far. For installation, you can follow this good tutorial (Hadoop 2.5.0).

There are some URLs that are useful for administering Hadoop 2.5.0 once you have started the system using start-all.sh, located in the sbin directory. I list them below:

1. NameNode (NN) Web UI: localhost:50070
There are several tabs in this web UI. In the Overview tab you can see the NN status, your total storage, used space, free space, and other statistics about your system. In the Datanodes tab you can find information about all of your functioning datanodes and decommissioned datanodes. The Snapshot tab contains information about the snapshots you have created. You can see the startup progress in the Startup Progress tab. The last tab, Utilities, is also very useful: it has links to the file system browser and the log browser.

You can access the Hadoop file system (HDFS) browser at http://localhost:50070/explorer.html. You can see your directory structure there, but you can't delete, rename, or modify anything; it only lets you view your directories and files. If you want to edit your directories or files, you can read my other post later (I will write it for you :D). The last link is the log browser at http://localhost:50070/logs/, where you can find all the logs created by the datanode, namenode, secondary namenode, resource manager, etc.

2. ResourceManager Web UI: localhost:8088
In the ResourceManager UI you can see a lot of information about your cluster, nodes, applications, scheduler, and more.
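If you want to check from a terminal whether these pages are reachable, something like the following sketch could work. It assumes curl is installed and that Hadoop 2.5.0 is running locally with its default ports. The function name check_url is made up for this example:

```shell
# Hypothetical helper: report whether a URL answered successfully.
# "down" covers connection failures, timeouts, and HTTP errors (4xx/5xx).
check_url() {
    if curl --silent --fail --max-time 3 --output /dev/null "$1"; then
        echo "up:   $1"
    else
        echo "down: $1"
    fi
}

check_url "http://localhost:50070/"               # NameNode web UI
check_url "http://localhost:50070/explorer.html"  # HDFS file browser
check_url "http://localhost:50070/logs/"          # log browser
check_url "http://localhost:8088/"                # ResourceManager web UI
```

If a page is reported as down, the corresponding daemon has probably not started, and its log file is the next place to look.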

--------------------------------------
I haven't explored all of Hadoop's features yet, but I hope you find this post useful. Please leave a comment if you have any questions or if the post helped you. Cheers!
