1 Official Grep example
By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.
The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.
mkdir input
cp etc/hadoop/*.xml input
# Run the grep example from hadoop-mapreduce-examples-2.7.2.jar on the input directory; it counts the strings that start with dfs
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
cat output/*
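To sanity-check what the job reports, the same counts can be approximated locally with standard Unix tools. The sketch below is an illustration only and is not part of Hadoop: the function name local_grep_count is hypothetical, and it assumes a grep that supports the -o, -h and -E options. It prints one "count<TAB>match" line per distinct match, most frequent first, which should resemble the contents of output/*.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a Hadoop command):
# count every match of an extended regex across the files in a directory.
local_grep_count() {
    dir=$1
    pattern=$2
    # -o prints each match on its own line, -h omits file names,
    # -E enables extended regular expressions such as '+'.
    grep -ohE "$pattern" "$dir"/* |
        sort |                    # group identical matches together
        uniq -c |                 # prefix each distinct match with its count
        sort -k1,1nr -k2 |        # highest count first, ties ordered by match
        awk '{ print $1 "\t" $2 }'
}

# Example call against the directory prepared above:
#   local_grep_count input 'dfs[a-z.]+'
```

If the Hadoop job is run a second time, remove the existing output directory first (rm -r output), because the job refuses to start when its output directory already exists.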
2 Official WordCount example
1 Create a directory named wcinput
mkdir wcinput
2 Create the source file word.txt inside wcinput and add the following two lines to it
touch wcinput/word.txt
hadoop yarn
hadoop mapreduce
3 Run the wordcount program
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount wcinput/ wcoutput
4 View the result
cat wcoutput/part-r-00000
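The four steps above can be cross-checked without Hadoop. The following is a minimal sketch, assuming words are separated only by spaces, tabs or newlines; the function name local_wordcount and the temporary demo directory are hypothetical and exist only for illustration. It prints "word<TAB>count" lines sorted by word, the same layout as part-r-00000.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a Hadoop command):
# count whitespace-separated words in every file of a directory.
local_wordcount() {
    dir=$1
    cat "$dir"/* |
        tr -s ' \t' '\n\n' |      # put every word on its own line
        grep -v '^$' |            # drop empty lines left by leading blanks
        LC_ALL=C sort |           # byte-order sort groups identical words
        uniq -c |                 # prefix each distinct word with its count
        awk '{ print $2 "\t" $1 }'
}

# Demo on the two sample lines from step 2, written to a throwaway directory.
demo_dir=$(mktemp -d)
printf 'hadoop yarn\nhadoop mapreduce\n' > "$demo_dir/word.txt"
local_wordcount "$demo_dir"
rm -rf "$demo_dir"
```

For the two sample lines, the demo prints hadoop with a count of 2, followed by mapreduce and yarn with a count of 1 each.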