Tuesday, November 4, 2014

Hadoop Tips: File system manipulation / modification commands

If you read my previous post about useful Hadoop URLs, you may remember that I promised to write about file system manipulation commands, such as adding a new file or directory, renaming a file or directory, or deleting a file or directory. These are easy to use if you are familiar with Linux commands.

In Hadoop 2.5.0, all file system manipulation commands can be run through the 'hdfs' script that you can find in Hadoop's bin/ directory. The usage pattern of the command is as follows:

$ hdfs dfs -<command>

Here are the commands that you need to know:

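1. ls

The "ls" command shows the contents of a directory in HDFS. If you do not give a path, it lists your HDFS home directory (/user/<username>), because HDFS has no notion of a "current" directory. You can add the -R option to list all subdirectories recursively, -h to print file sizes in a human-readable format, and -d to list a directory itself instead of its contents.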

$ hdfs dfs -ls [-R] [-h] [-d] [<hdfs path>]
$ hdfs dfs -ls -R

2. put

The "put" command uploads a local file or directory into HDFS. If the local path is a directory rather than a single file, the whole directory and its contents are uploaded to the HDFS destination. Here is how to use it:

$ hdfs dfs -put <localpath> <hdfs path>
$ hdfs dfs -put local-file.txt destination-file.txt
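
To make the directory case concrete, here is a small sketch that uploads a whole local directory and then lists the result. The helper names (hdfs_dfs, upload_dir), the directory name "local-directory", and the DRY_RUN switch are my own assumptions for illustration, not part of Hadoop. By default the script only prints the commands it would run, so you can check them first.

```shell
#!/bin/sh
# Sketch only: "local-directory" and the HDFS path are hypothetical names.
# Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or run an "hdfs dfs" command.
hdfs_dfs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "hdfs dfs $*"
    else
        hdfs dfs "$@"
    fi
}

# Upload a local directory with everything in it, then list the destination.
upload_dir() {
    local_dir="$1"
    hdfs_dest="$2"
    hdfs_dfs -put "$local_dir" "$hdfs_dest" &&
    hdfs_dfs -ls -R "$hdfs_dest"
}

upload_dir local-directory /user/username/
```

Run it with DRY_RUN=0 once the printed commands look right for your cluster.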

3. mkdir

You can create a directory in HDFS using the "mkdir" command.

$ hdfs dfs -mkdir <destinationpath>/<directory name>
$ hdfs dfs -mkdir /user/username/new-directory
$ hdfs dfs -mkdir new-directory

4. mv

Just like in Linux, you can use the "mv" command to move a file or directory from one location to another. You can also rename a file or directory with the same command. Here are the examples:

$ hdfs dfs -mv <hdfs old location> <hdfs new location>
$ hdfs dfs -mv /user/username/something.txt /user/username/otherdirectory/
$ hdfs dfs -mv /user/username/onedirectory /user/username/otherdirectory/
$ hdfs dfs -mv <hdfs old name> <hdfs new name>
$ hdfs dfs -mv /user/username/something.txt /user/username/newthing.txt
$ hdfs dfs -mv /user/username/olddirectory /user/username/newdirectory

5. rm

To delete files or directories, use the "rm" command. Add the -R option to delete a directory and everything inside it recursively.

$ hdfs dfs -rm [-R] <file/directory to be deleted>
$ hdfs dfs -rm somefile.txt
$ hdfs dfs -rm -R directory

I think these 5 commands will give you the "power" to manipulate HDFS files and directories :)
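
To see how the five commands fit together, here is a rough sketch of a typical sequence: create a directory, upload a file, list it, rename it, and clean up. All paths and file names are made-up examples, and the hdfs_dfs and hdfs_workflow helpers and the DRY_RUN switch are my own assumptions, not something Hadoop provides. Because the last step deletes data, the script only prints the commands by default.

```shell
#!/bin/sh
# Sketch only: the paths and file names below are made-up examples.
# By default each command is just printed; set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or run an "hdfs dfs" command.
hdfs_dfs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "hdfs dfs $*"
    else
        hdfs dfs "$@"
    fi
}

# Run the five commands in order, stopping at the first failure.
hdfs_workflow() {
    hdfs_dfs -mkdir /user/username/new-directory &&
    hdfs_dfs -put local-file.txt /user/username/new-directory/ &&
    hdfs_dfs -ls -R /user/username/new-directory &&
    hdfs_dfs -mv /user/username/new-directory/local-file.txt /user/username/new-directory/renamed-file.txt &&
    hdfs_dfs -rm -R /user/username/new-directory
}

hdfs_workflow
```

The commands are chained with && so that a failed step (for example, a directory that already exists) stops the script before it reaches the delete.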

If you want a more complete list, you can refer to this documentation. There you will find "cat", "touchz", "cp", and many other commands.

If you find my post useful, please leave a comment below. Thanks for reading.
