apache spark kubernets 部署试用

spark 是一个不错的平台,支持rdd 分析stream 机器学习。。。
以下为使用kubernetes 部署的说明,以及注意的地方

具体的容器镜像使用别人已经构建好的

deploy yaml 文件

deploy-k8s.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: spark-master
namespace: big-data
labels:
app: spark-master
spec:
replicas: 1
template:
metadata:
labels:
app: spark-master
spec:
containers:
- name: spark-master
image: bde2020/spark-master:2.3.1-hadoop2.7
imagePullPolicy: IfNotPresent
ports:
- containerPort: 7077
- containerPort: 8080
env:
- name: ENABLE_INIT_DAEMON
value: "false"
- name: SPARK_MASTER_PORT
value: "7077" --- apiVersion: v1
kind: Service
metadata:
name: spark-master-service
namespace: big-data
spec:
type: NodePort
ports:
- port: 7077
targetPort: 7077
protocol: TCP
name: master
selector:
app: spark-master --- apiVersion: v1
kind: Service
metadata:
name: spark-webui-service
namespace: big-data
spec:
ports:
- port: 8080
targetPort: 8080
protocol: TCP
name: ui
selector:
app: spark-master
type: NodePort --- apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: spark-webui-ingress
namespace: big-data
spec:
rules:
- host: spark-webui.data.com
http:
paths:
- backend:
serviceName: spark-webui-service
servicePort: 8080
path: / --- apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: spark-worker
namespace: big-data
labels:
app: spark-worker
spec:
replicas: 1
template:
metadata:
labels:
app: spark-worker
spec:
containers:
- name: spark-worker
image: bde2020/spark-worker:2.3.1-hadoop2.7
imagePullPolicy: IfNotPresent
env:
- name: SPARK_MASTER
value: spark://spark-master-service:7077
- name: ENABLE_INIT_DAEMON
value: "false"
- name: SPARK_WORKER_WEBUI_PORT
value: "8081"
ports:
- containerPort: 8081 --- apiVersion: v1
kind: Service
metadata:
name: spark-worker-service
namespace: big-data
spec:
type: NodePort
ports:
- port: 8081
targetPort: 8081
protocol: TCP
name: worker
selector:
app: spark-worker ---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: spark-worker-ingress
namespace: big-data
spec:
rules:
- host: spark-worker.data.com
http:
paths:
- backend:
serviceName: spark-worker-service
servicePort: 8081
path: /

部署&&运行

  • 部署
kubectl apply -f deploy-k8s.yaml
  • 效果

    使用ingress 访问,访问域名 spark-webui.data.com

apache spark kubernets 部署试用
apache spark kubernets 部署试用

说明

  • 命名的问题
平时的习惯是deploy service 命名为一样的,但是就是这个就有问题的,因为k8s 默认会进行环境变量的注入,所以居然冲突的。
解决方法,修改名称,重新发布
具体问题:
dockerfile 中的以下环境变量
ENV SPARK_MASTER_PORT 7077
  • spark 任务运行
具体的运行可以参考官方demo,后期也会添加

参考资料

https://github.com/rongfengliang/spark-k8s-deploy
https://github.com/big-data-europe/docker-spark

 
 
 
 
上一篇:在PC机上,如何用Chrome浏览器模拟查看和调试手机的HTML5页面?


下一篇:Bootstrap系列 -- 29. 按钮组