本文介绍安装spark单机环境的方法,可用于测试及开发。主要分成以下4部分:
(1)环境准备
(2)安装scala
(3)安装spark
(4)验证安装情况
1、环境准备
(1)配套软件版本要求:Spark runs on Java 6+ and Python 2.6+. For the Scala API, Spark 1.3.1 uses Scala 2.10. You will need to use a compatible Scala version (2.10.x).
(2)安装好linux、jdk、python, 一般linux均会自带安装好jdk与python,但注意jdk默认为openjdk,建议重新安装oracle jdk。
(3)IP:10.171.29.191 hostname:master
2、安装scala
(1)下载scala
wget http://downloads.typesafe.com/scala/2.10.5/scala-2.10.5.tgz
(2)解压文件
tar -zxvf scala-2.10.5.tgz
(3)配置环境变量
#vi/etc/profile
#SCALA VARIABLES START
export SCALA_HOME=/home/jediael/setupfile/scala-2.10.5
export PATH=$PATH:$SCALA_HOME/bin
#SCALA VARIABLES END
$ source /etc/profile
$ scala -version
Scala code runner version 2.10.5 -- Copyright 2002-2013, LAMP/EPFL
(4)验证scala
$ scala
Welcome to Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51).
Type in expressions to have them evaluated.
Type :help for more information.
scala> 9*9
res0: Int = 81
3、安装spark
(1)下载spark
wget http://mirror.bit.edu.cn/apache/spark/spark-1.3.1/spark-1.3.1-bin-hadoop2.6.tgz
(2)解压spark
tar -zxvf http://mirror.bit.edu.cn/apache/spark/spark-1.3.1/spark-1.3.1-bin-hadoop2.6.tgz
(3)配置环境变量
#vi/etc/profile
#SPARK VARIABLES START
export SPARK_HOME=/mnt/jediael/spark-1.3.1-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
#SPARK VARIABLES END
$ source /etc/profile
(4)配置spark
$ pwd
/mnt/jediael/spark-1.3.1-bin-hadoop2.6/conf
$ mv spark-env.sh.template spark-env.sh
$vi spark-env.sh
export SCALA_HOME=/home/jediael/setupfile/scala-2.10.5
export JAVA_HOME=/usr/java/jdk1.7.0_51
export SPARK_MASTER_IP=10.171.29.191
export SPARK_WORKER_MEMORY=512m
export master=spark://10.171.29.191:7070
$vi slaves
master
(5)启动spark
pwd
/mnt/jediael/spark-1.3.1-bin-hadoop2.6/sbin
$ ./start-all.sh
注意,hadoop也有start-all.sh脚本,因此必须进入具体目录执行脚本
$ jps
30302 Worker
30859 Jps
30172 Master
4、验证安装情况
(1)运行自带示例
$ bin/run-example org.apache.spark.examples.SparkPi
(2)查看集群环境
http://master:8080/
(3)进入spark-shell
$spark-shell
(4)查看jobs等信息
http://master:4040/jobs/
版权声明:本文为博主原创文章,未经博主允许不得转载。