I'm trying to set up a single-node development cluster with Hadoop on Mac OS X 10.9.2. I've tried various online tutorials, most recently this one. To summarize what I did:
1) $ brew install hadoop
This installed Hadoop 2.2.0 in /usr/local/Cellar/hadoop/2.2.0.
2) Configured environment variables. Here's what the relevant part of my .bash_profile looks like:
### Java_HOME
export JAVA_HOME="$(/usr/libexec/java_home)"
### HADOOP Environment variables
export HADOOP_PREFIX="/usr/local/Cellar/hadoop/2.2.0"
export HADOOP_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/libexec/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export CLASSPATH=$CLASSPATH:.
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/libexec/share/hadoop/common/hadoop-common-2.2.0.jar
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/libexec/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar
3) Configured HDFS
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/Cellar/hadoop/2.2.0/hdfs/datanode</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/Cellar/hadoop/2.2.0/hdfs/namenode</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
</property>
</configuration>
3) Configured core-site.xml
<!-- Let Hadoop modules know where the HDFS NameNode is at! -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
<description>NameNode URI</description>
</property>
4) Configured yarn-site.xml
<configuration>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>128</value>
<description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
<description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
<description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value. </description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>4096</value>
<description>Physical memory, in MB, to be made available to running containers</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
<description>Number of CPU cores that can be allocated for containers.</description>
</property>
</configuration>
5) Then I tried to format the namenode using:
$HADOOP_PREFIX/bin/hdfs namenode -format
This gives me the error: Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
I looked at the hdfs script, and the line that runs the namenode basically amounts to calling
$java org.apache.hadoop.hdfs.server.namenode.NameNode.
So, thinking this was a classpath issue, I tried a few things:
a) adding hadoop-common-2.2.0.jar and hadoop-hdfs-2.2.0.jar to the classpath, as you can see above in my .bash_profile
b) adding the line
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
to my .bash_profile on the recommendation of this tutorial. (I later removed it because it didn't seem to help.)
c) I also considered writing a shell script that adds every jar in $HADOOP_HOME/libexec/share/hadoop to $HADOOP_CLASSPATH, but that seemed unnecessary and prone to future problems.
Any idea why I keep getting "Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode"? Thanks in advance.
Because of the way the Homebrew package is laid out, with the etc and share directories sitting under libexec rather than at the top of the package, you need to point HADOOP_PREFIX at the libexec folder inside the package:
export HADOOP_PREFIX="/usr/local/Cellar/hadoop/2.2.0/libexec"
You would then remove libexec from your declaration of the conf directory:
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
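If you want to confirm which directory actually holds the Hadoop tree before exporting it, a small check along these lines may help. This is only a sketch: `is_hadoop_prefix` and `resolve_hadoop_prefix` are hypothetical helper names, not part of Hadoop or Homebrew, and the test for a `share/hadoop/hdfs` directory is an assumption based on the jar paths shown in the question.

```shell
# Hypothetical helper: succeeds if the directory contains share/hadoop/hdfs,
# which is where the HDFS jars sit in the layout shown in the question.
is_hadoop_prefix() {
    [ -d "$1/share/hadoop/hdfs" ]
}

# Hypothetical helper: given a Homebrew package directory, print the directory
# to use as HADOOP_PREFIX (the package directory itself or its libexec folder).
resolve_hadoop_prefix() {
    keg="$1"
    if is_hadoop_prefix "$keg"; then
        printf '%s\n' "$keg"
    elif is_hadoop_prefix "$keg/libexec"; then
        printf '%s\n' "$keg/libexec"
    else
        echo "no Hadoop layout found under $keg" >&2
        return 1
    fi
}
```

With those functions defined in .bash_profile, `export HADOOP_PREFIX="$(resolve_hadoop_prefix /usr/local/Cellar/hadoop/2.2.0)"` should yield the libexec path on an install laid out like the one described above, and the remaining variables can be derived from it as before.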