The sample description for Hadoop Streaming on Azure contains a path typo, so you may run into an error when you try to run it as written.
The problem is with the “HDFS://…” arguments. The sample gives them a path that starts with example/apps.
This implies that the application resides in example/apps. When you access it through an HDFS URL, the path is resolved from the root directory, so Hadoop looks for an “example” directory in the root and does not find one. Your actual directory is under user/<your_user_name>, where “user” is in the root directory.
Hence, the HDFS URL should be hdfs://xxx.xxx.xxx.xxx:9000/user/user_name/example/apps/wc.exe
The same applies to the other parameters, -input “/example/data/davinci.txt” and -output “/example/data/StreamingOutput/wc.txt”, which specify the input data and the output directory. Here too, the leading slash makes the “example” directory resolve from the root directory. The paths should instead start with “example/data” (no leading slash), which resolves to “user/user_name/example/data”.
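The resolution rule described above can be sketched in a few lines of Python. This is only a simplified model, not Hadoop's own code: the function name resolve_hdfs_path is hypothetical, and it assumes the user's home directory is /user/<user_name>, as in this post.

```python
import posixpath


def resolve_hdfs_path(path: str, user: str) -> str:
    """Simplified model of how a path is resolved on HDFS (hypothetical helper).

    A path with a leading slash is taken from the root directory as-is.
    A path without one is taken relative to the assumed home /user/<user>.
    """
    if path.startswith("/"):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join("/user", user, path))


# Leading slash: resolved from the root, where no "example" directory exists.
print(resolve_hdfs_path("/example/data/davinci.txt", "user_name"))

# No leading slash: resolved under the user's home directory.
print(resolve_hdfs_path("example/data/davinci.txt", "user_name"))
```

The first call leaves the path under the root, which is the case that fails; the second yields a path under /user/user_name, which is where the sample files actually live.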