I just installed pyspark 2.2.0 with conda (using Python v3.6 on Windows 7 64-bit, with Java v1.8):
$conda install pyspark
It downloaded and appeared to install correctly, with no errors. Now when I run pyspark on the command line, it just tells me "The system cannot find the path specified."
$pyspark
The system cannot find the path specified.
The system cannot find the path specified.
I tried including the pyspark directory in my PATH environment variable, but that still didn't seem to work; perhaps I entered the wrong path? Does the Java path also need to be specified in the PATH environment variable? Can anyone please advise? Thanks.
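In case it helps, here is the quick sanity check I ran to see what my Python process actually picks up from the environment (the output paths obviously depend on the machine):

import os

# Print the variables that matter for locating Java and Spark;
# "<not set>" means the variable is missing from the environment.
for var in ("PATH", "JAVA_HOME", "SPARK_HOME"):
    print(var, "=", os.environ.get(var, "<not set>"))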
Solution:
PySpark from PyPI (i.e. installed with pip or conda) does not contain the full PySpark functionality; it is only intended for use with a Spark installation in an already existing cluster, in which case you might want to avoid downloading the whole Spark distribution. From the docs:
The Python packaging for Spark is not intended to replace all of the other use cases. This Python packaged version of Spark is suitable for interacting with an existing cluster (be it Spark standalone, YARN, or Mesos) – but does not contain the tools required to set up your own standalone Spark cluster. You can download the full version of Spark from the Apache Spark downloads page.
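For example, the pip/conda package alone is enough to submit work to a cluster that is already running. A minimal sketch, assuming a Spark standalone cluster; the master URL spark://master-host:7077 is a placeholder for your own cluster's address:

from pyspark.sql import SparkSession

# Connect to an existing standalone cluster; no local Spark
# distribution is needed beyond the pip/conda package itself.
spark = (SparkSession.builder
         .master("spark://master-host:7077")
         .appName("existing-cluster-demo")
         .getOrCreate())

print(spark.range(5).count())  # trivial job to verify the connection
spark.stop()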
If you intend to work in the PySpark shell, I suggest you download Spark as described above (PySpark is an essential component of it).
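If you go that route on Windows, you can point Python at the unpacked distribution instead of fighting with PATH by hand. A minimal sketch using the findspark helper package (pip install findspark); the extraction directory C:\spark\spark-2.2.0-bin-hadoop2.7 is hypothetical, so substitute wherever you unpacked Spark:

import findspark

# Hypothetical extraction directory -- adjust to your own machine.
# findspark sets SPARK_HOME and the Python path for this process.
findspark.init(r"C:\spark\spark-2.2.0-bin-hadoop2.7")

from pyspark.sql import SparkSession

# Run Spark locally, using all available cores.
spark = (SparkSession.builder
         .master("local[*]")
         .appName("local-demo")
         .getOrCreate())

print(spark.version)
spark.stop()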