I wanted to learn about Hadoop, so I downloaded the latest release and set up an environment for it. Below are the steps I followed to get Hadoop running on Windows 7.
1. Download Hadoop and extract it with WinRAR. (The extraction path must not contain spaces.)
2. Install Java; I won't cover that here. (The Java path must not contain spaces either.)
3. Set the environment variable HADOOP_HOME to the Hadoop installation directory.
4. Add the bin folder under the Hadoop installation directory to the PATH environment variable. One way to do both steps from the command line is sketched below.
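For example, from a cmd prompt (C:\hadoop here is just a placeholder for wherever you extracted Hadoop):

rem Set HADOOP_HOME for the current user; new cmd windows pick it up.
setx HADOOP_HOME "C:\hadoop"
rem Append the bin folder to the user PATH. %PATH% is expanded once, at
rem write time, so run this only once (or edit PATH in the System dialog).
setx PATH "%PATH%;C:\hadoop\bin"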
5. In the etc/hadoop/hadoop-env.cmd file under the Hadoop directory, change the JAVA_HOME variable to the path of your current Java installation.
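The relevant line looks like the following; the JDK path is only an example, so substitute your own (again, no spaces):

@rem Replace the default "set JAVA_HOME=%JAVA_HOME%" line
@rem with an explicit, space-free path to your JDK.
set JAVA_HOME=C:\Java\jdk1.7.0_03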
6. Download hadoop-common-2.2.zip. Because we are running on Windows, the official package is missing files such as winutils.exe and hadoop.dll; once the download finishes, copy all the files from the archive's bin directory into the bin folder under the Hadoop directory.
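Before going further, it's worth a quick sanity check that the previous steps took effect (paths assume C:\hadoop as above):

rem Should print Hadoop's version banner, confirming that HADOOP_HOME,
rem PATH and JAVA_HOME are all set correctly.
hadoop version
rem Should print winutils' usage text if the copy succeeded.
C:\hadoop\bin\winutils.exe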
7. Edit the configuration files. The four files below all live in the etc/hadoop directory under the Hadoop directory.
Edit core-site.xml as follows:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
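fs.defaultFS is the URI that clients use to reach HDFS. Once the daemons are running (final steps below), a one-line smoke test of this setting:

rem Lists the HDFS root via hdfs://localhost:9000; fails quickly if the NameNode is unreachable.
hadoop fs -ls /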
Edit hdfs-site.xml as follows:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/hadoop/data/dfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/hadoop/data/dfs/datanode</value>
  </property>
</configuration>
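Note that file:/hadoop/data/... resolves against the root of the drive you run Hadoop from. Formatting the NameNode creates these directories for you, but you can also create them up front to be explicit (here assuming the C: drive):

mkdir C:\hadoop\data\dfs\namenode
mkdir C:\hadoop\data\dfs\datanode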
Edit yarn-site.xml as follows:
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
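These two properties are what let MapReduce jobs shuffle their map output through the NodeManager. Once everything is up (final steps below), you can confirm the NodeManager registered with the ResourceManager:

rem Should list one RUNNING node for this single-machine setup.
yarn node -list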
Edit mapred-site.xml as follows:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
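In the 2.2 distribution this file usually ships only as mapred-site.xml.template, so create it first if it's missing:

cd /d C:\hadoop\etc\hadoop
copy mapred-site.xml.template mapred-site.xml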
Then open cmd and run hdfs namenode -format. The output should look roughly like this:
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\Users\abhijitg>cd c:\hadoop\bin

c:\hadoop\bin>hdfs namenode -format
13/11/03 18:07:47 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ABHIJITG/x.x.x.x
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.2.0
STARTUP_MSG:   classpath = <classpath jars here>
STARTUP_MSG:   build = Unknown -r Unknown; compiled by ABHIJITG on 2013-11-01T13:42Z
STARTUP_MSG:   java = 1.7.0_03
************************************************************/
Formatting using clusterid: CID-1af0bd9f-efee-4d4e-9f03-a0032c22e5eb
13/11/03 18:07:48 INFO namenode.HostFileManager: read includes: HostSet()
13/11/03 18:07:48 INFO namenode.HostFileManager: read excludes: HostSet()
13/11/03 18:07:48 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
13/11/03 18:07:48 INFO util.GSet: Computing capacity for map BlocksMap
13/11/03 18:07:48 INFO util.GSet: VM type = 64-bit
13/11/03 18:07:48 INFO util.GSet: 2.0% max memory = 888.9 MB
13/11/03 18:07:48 INFO util.GSet: capacity = 2^21 = 2097152 entries
13/11/03 18:07:48 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
13/11/03 18:07:48 INFO blockmanagement.BlockManager: defaultReplication = 1
13/11/03 18:07:48 INFO blockmanagement.BlockManager: maxReplication = 512
13/11/03 18:07:48 INFO blockmanagement.BlockManager: minReplication = 1
13/11/03 18:07:48 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
13/11/03 18:07:48 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
13/11/03 18:07:48 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
13/11/03 18:07:48 INFO blockmanagement.BlockManager: encryptDataTransfer = false
13/11/03 18:07:48 INFO namenode.FSNamesystem: fsOwner = ABHIJITG (auth:SIMPLE)
13/11/03 18:07:48 INFO namenode.FSNamesystem: supergroup = supergroup
13/11/03 18:07:48 INFO namenode.FSNamesystem: isPermissionEnabled = true
13/11/03 18:07:48 INFO namenode.FSNamesystem: HA Enabled: false
13/11/03 18:07:48 INFO namenode.FSNamesystem: Append Enabled: true
13/11/03 18:07:49 INFO util.GSet: Computing capacity for map INodeMap
13/11/03 18:07:49 INFO util.GSet: VM type = 64-bit
13/11/03 18:07:49 INFO util.GSet: 1.0% max memory = 888.9 MB
13/11/03 18:07:49 INFO util.GSet: capacity = 2^20 = 1048576 entries
13/11/03 18:07:49 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/11/03 18:07:49 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
13/11/03 18:07:49 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
13/11/03 18:07:49 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
13/11/03 18:07:49 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
13/11/03 18:07:49 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
13/11/03 18:07:49 INFO util.GSet: Computing capacity for map Namenode Retry Cache
13/11/03 18:07:49 INFO util.GSet: VM type = 64-bit
13/11/03 18:07:49 INFO util.GSet: 0.029999999329447746% max memory = 888.9 MB
13/11/03 18:07:49 INFO util.GSet: capacity = 2^15 = 32768 entries
13/11/03 18:07:49 INFO common.Storage: Storage directory \hadoop\data\dfs\namenode has been successfully formatted.
13/11/03 18:07:49 INFO namenode.FSImage: Saving image file \hadoop\data\dfs\namenode\current\fsimage.ckpt_0000000000000000000 using no compression
13/11/03 18:07:49 INFO namenode.FSImage: Image file \hadoop\data\dfs\namenode\current\fsimage.ckpt_0000000000000000000 of size 200 bytes saved in 0 seconds.
13/11/03 18:07:49 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
13/11/03 18:07:49 INFO util.ExitUtil: Exiting with status 0
13/11/03 18:07:49 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ABHIJITG/x.x.x.x
************************************************************/
Then, in cmd, change to the sbin directory under the Hadoop directory and run start-all; this opens four cmd windows, one per daemon. Open a browser at http://localhost:8042 and http://localhost:50070 to check whether the setup succeeded.
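Put together, start-up plus a quick process check looks like this (jps ships with the JDK):

cd /d C:\hadoop\sbin
start-all.cmd
rem Once the four windows settle, jps should list NameNode, DataNode,
rem ResourceManager and NodeManager alongside its own process.
jps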
If http://localhost:50070 won't open, re-format the NameNode and then restart Hadoop.
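The rough recovery sequence, with paths as above (note that this wipes everything stored in HDFS):

cd /d C:\hadoop\sbin
stop-all.cmd
rmdir /s /q C:\hadoop\data\dfs\namenode
rmdir /s /q C:\hadoop\data\dfs\datanode
hdfs namenode -format
start-all.cmd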