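1. Introduction
As our need for performance testing grows, distributed load testing is becoming more urgent. The Kubernetes cluster in our test environment is mostly idle, and load tests are usually one-off or temporary. If we run load tests on that cluster and reclaim the resources afterwards, we make much better use of the test environment's capacity.
The distributed testing approach below builds on JMeter's master/slave structure and runs the load generators on a Kubernetes cluster. All test resources are created in one go and can be reclaimed as soon as the test finishes. You choose the number of load-generating instances as needed, the test is described as a standard JMeter test plan, and a single command runs the distributed test.
Example of the interface:
2. Architecture
Based on JMeter's master/slave architecture, JMeter instances are created inside the Kubernetes cluster and write their results to InfluxDB. Metrics from the system under test are shipped to InfluxDB by Telegraf, and Grafana displays everything.
Once the test is finished, the JMeter, InfluxDB and Grafana resources can simply be destroyed.
3. Usage
3.1 Prerequisites
You need a Kubernetes cluster (a dedicated namespace is enough) and a server from which kubectl commands can be run. You also need InfluxDB and Grafana images; if you do not have them, pull them from the internet or build your own.
3.2 One-time preparation
3.2.1 Prepare the images
We assume the following images are available: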
grafana/grafana:5.0.1
influxdb:1.1.0
openjdk:8
3.2.2 Prepare JMeter 5.4.1
After unpacking the archive, edit jmeter.properties and change the parameter below from false to true:
# Disable SSL for RMI so that master and slaves can talk without a keystore
server.rmi.ssl.disable=true
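The same edit can be scripted, which helps when the JMeter directory is rebuilt often. The following is a minimal sketch, assuming the property appears either commented out or active in the file; the function name disable_rmi_ssl is made up for this example.

```shell
#!/bin/sh
# Set server.rmi.ssl.disable=true in a jmeter.properties file.
# Rewrites the line whether it is commented out ("#server...") or active,
# and appends the setting if no such line exists yet.
disable_rmi_ssl() {
  props="$1"
  if grep -q '^#\{0,1\}[[:space:]]*server\.rmi\.ssl\.disable=' "$props"; then
    sed 's/^#\{0,1\}[[:space:]]*server\.rmi\.ssl\.disable=.*/server.rmi.ssl.disable=true/' \
      "$props" > "$props.tmp" && mv "$props.tmp" "$props"
  else
    echo 'server.rmi.ssl.disable=true' >> "$props"
  fi
}

# Example (assumes the archive was unpacked in the current directory):
# disable_rmi_ssl apache-jmeter-5.4.1/bin/jmeter.properties
```

The setting has to be identical in the master and slave images, so it is simplest to patch the directory once before building both.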
3.2.3 Build the JMeter images
Master image
FROM openjdk:8
COPY apache-jmeter-5.4.1 /apache-jmeter-5.4.1
ENTRYPOINT ["/apache-jmeter-5.4.1/bin/jmeter","-n","-t", "/testplan.jmx", "-r"]
Slave image
FROM openjdk:8
COPY apache-jmeter-5.4.1 /apache-jmeter-5.4.1
ENTRYPOINT ["/apache-jmeter-5.4.1/bin/jmeter-server"]
In the end we need four images. In my registry they are:
grafana/grafana:5.0.1
influxdb:1.1.0
192.168.174.50:5000/jmetermaster:2.0
192.168.174.50:5000/jmeter:2.0
Once these images are in place, there is no need to build images again.
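Building and pushing the two JMeter images could be scripted roughly as follows. This is a sketch under assumptions: Dockerfile.master and Dockerfile.slave are hypothetical file names for the two Dockerfiles shown earlier, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Build and push the JMeter master and slave images.
# Dockerfile.master / Dockerfile.slave are hypothetical file names.
REGISTRY="${REGISTRY:-192.168.174.50:5000}"
TAG="${TAG:-2.0}"

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = image name, $2 = Dockerfile to build from
build_and_push() {
  run docker build -t "$REGISTRY/$1:$TAG" -f "$2" .
  run docker push "$REGISTRY/$1:$TAG"
}

build_images() {
  build_and_push jmetermaster Dockerfile.master
  build_and_push jmeter Dockerfile.slave
}

build_images
```

Run it from the directory that contains apache-jmeter-5.4.1, since both Dockerfiles copy that directory into the image.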
3.3 Running a load test
3.3.1 Start InfluxDB
The influxdb directory contains the configuration files and the startup script:
k8s@kube-master1:~/dspt/influxdb$ tree
.
├── clean.sh
├── deploymentInfluxDb.sh
├── influxdb_deployment.yml
└── influxdb_service.yml
Adjust the addresses and ports in these files to match your environment.
In the deployment, the image address is what you may need to change; the default here pulls from Docker Hub.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: influxdb-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: influxdb
  template:
    metadata:
      labels:
        app: influxdb
    spec:
      containers:
      - name: influxdb
        image: influxdb:1.1.0
        ports:
        - containerPort: 8086
        - containerPort: 8083
In the service, the NodePorts are what you may need to change; 30001 and 30002 are used here.
apiVersion: v1
kind: Service
metadata:
  name: influxdb-service
spec:
  type: NodePort
  selector:
    app: influxdb
  ports:
  - name: http1
    protocol: TCP
    port: 8086
    targetPort: 8086
    nodePort: 30001
  - name: http2
    protocol: TCP
    port: 8083
    targetPort: 8083
    nodePort: 30002
Run deploymentInfluxDb.sh. Adjust the IP address in the script to match your cluster.
# Clean up any existing InfluxDB resources
echo 'Cleaning up existing InfluxDB resources'
kubectl delete service influxdb-service
kubectl delete deployment influxdb-deployment
echo 'Creating the InfluxDB service'
kubectl create -f influxdb_deployment.yml
kubectl create -f influxdb_service.yml
echo 'Created: database API on port 30001, web UI on port 30002'
echo 'Waiting 10s'
sleep 10
echo 'Creating database jmeter'
curl -X POST 'http://192.168.174.51:30001/query?q=create+database+%22jmeter%22&db=_internal'
echo 'Creating database telegraf'
curl -X POST 'http://192.168.174.51:30001/query?q=create+database+%22telegraf%22&db=_internal'
The output looks like this:
k8s@kube-master1:~/dspt/influxdb$ sh deploymentInfluxDb.sh
Cleaning up existing InfluxDB resources
Error from server (NotFound): services "influxdb-service" not found
Error from server (NotFound): deployments.apps "influxdb-deployment" not found
Creating the InfluxDB service
deployment.apps/influxdb-deployment created
service/influxdb-service created
Created: database API on port 30001, web UI on port 30002
Waiting 10s
Creating database jmeter
{"results":[{}]}
Creating database telegraf
{"results":[{}]}
Open the web UI in a browser.
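The startup script relies on a fixed 10-second sleep before it creates the databases, which can be too short when the image still has to be pulled. A possible alternative, sketched here, polls InfluxDB's /ping endpoint (which answers with HTTP 204 once the server is up) and only then issues the CREATE DATABASE statements. The helper names wait_for, influx_ping and create_database are made up for this example, and the address is the one used in this article.

```shell
#!/bin/sh
INFLUX_URL="${INFLUX_URL:-http://192.168.174.51:30001}"

# Retry a command until it succeeds.
# Usage: wait_for <attempts> <delay_seconds> <command...>
wait_for() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Succeeds once InfluxDB answers GET /ping without an HTTP error.
influx_ping() {
  curl -sf "$INFLUX_URL/ping"
}

# Create a database; curl URL-encodes the statement for us.
create_database() {
  curl -sS -X POST "$INFLUX_URL/query" \
    --data-urlencode "q=CREATE DATABASE \"$1\""
}

# Example: wait up to 30 x 2 seconds, then create both databases.
# wait_for 30 2 influx_ping && create_database jmeter && create_database telegraf
```

Using --data-urlencode avoids hand-encoding the quotes and spaces as %22 and +, which is what the URLs in the original script do manually.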
3.3.2 Deploy Telegraf on the target machine
We use version 1.17.3. The InfluxDB address in the Telegraf configuration needs to be changed.
k8s@kube-master1:~/dspt/telegraf-1.17.3/usr/bin$ tree
.
├── telegraf
└── telegraf.conf
Edit the configuration file and set the InfluxDB address and port:
[[outputs.influxdb]]
retention_policy = ""
urls = ["http://192.168.174.51:30001"]
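If the configuration is generated rather than edited by hand, the output section can be produced by a small helper. The sketch below prints an [[outputs.influxdb]] section; the function name influx_output_section is hypothetical, and the explicit database key is an addition for clarity (Telegraf writes to a database named telegraf by default, which matches the database created earlier).

```shell
#!/bin/sh
# Print an [[outputs.influxdb]] section for telegraf.conf.
# $1 = InfluxDB URL, $2 = target database
influx_output_section() {
  cat <<EOF
[[outputs.influxdb]]
  urls = ["$1"]
  database = "$2"
  retention_policy = ""
EOF
}

influx_output_section "http://192.168.174.51:30001" "telegraf"
```

Before starting the agent for real, running ./telegraf --config telegraf.conf --test gathers the input metrics once and prints them without sending anything to InfluxDB, which is a quick way to confirm the file parses.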
Start Telegraf:
k8s@kube-master1:~/dspt/telegraf-1.17.3/usr/bin$ ./telegraf --config telegraf.conf
2021-03-08T07:53:12Z I! Starting Telegraf 1.17.3
2021-03-08T07:53:13Z I! Loaded inputs: cpu disk diskio kernel mem net netstat processes swap sysstat system
2021-03-08T07:53:13Z I! Loaded aggregators:
2021-03-08T07:53:13Z I! Loaded processors:
2021-03-08T07:53:13Z I! Loaded outputs: influxdb
2021-03-08T07:53:13Z I! Tags enabled: host=kube-master1
2021-03-08T07:53:13Z I! [agent] Config: Interval:15s, Quiet:false, Hostname:"kube-master1", Flush Interval:15s
3.3.3 Start Grafana
Grafana has its own directory as well:
k8s@kube-master1:~/dspt/grafana$ tree
.
├── clean.sh
├── deployGrafana.sh
├── grafana_deployment.yml
└── grafana_service.yml
Adjust the configuration to match your image address and ports.
grafana_deployment.yml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:5.0.1
        ports:
        - containerPort: 3000
grafana_service.yml:
apiVersion: v1
kind: Service
metadata:
  name: grafana-service
spec:
  type: NodePort
  selector:
    app: grafana
  ports:
  - protocol: TCP
    port: 3000
    targetPort: 3000
    nodePort: 30003
Start Grafana:
k8s@kube-master1:~/dspt/grafana$ sh deployGrafana.sh
Error from server (NotFound): services "grafana-service" not found
Error from server (NotFound): deployments.apps "grafana-deployment" not found
deployment.apps/grafana-deployment created
service/grafana-service created
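Open Grafana in a browser; the username and password are admin and admin.
Under "add data source", add two data sources:
The first is named DS_BLZAEDEMO, so that the JMeter dashboard template can be used.
The second is named DS_TELEGRAF, so that the Telegraf dashboard template can be used.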
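The two data sources can also be created through Grafana's HTTP API instead of the web form. The sketch below makes several assumptions: Grafana is reachable on the same node IP as InfluxDB at NodePort 30003, the default admin/admin login is still in place, and DS_BLZAEDEMO reads from the jmeter database while DS_TELEGRAF reads from telegraf. The helper names are hypothetical.

```shell
#!/bin/sh
GRAFANA_URL="${GRAFANA_URL:-http://192.168.174.51:30003}"  # assumed node IP
INFLUX_URL="${INFLUX_URL:-http://192.168.174.51:30001}"

# Print the JSON body for an InfluxDB data source.
# $1 = data source name, $2 = InfluxDB database
datasource_json() {
  printf '{"name":"%s","type":"influxdb","access":"proxy","url":"%s","database":"%s"}\n' \
    "$1" "$INFLUX_URL" "$2"
}

# POST the data source to Grafana with the default admin/admin login.
add_datasource() {
  datasource_json "$1" "$2" |
    curl -sS -u admin:admin -H 'Content-Type: application/json' \
      -X POST "$GRAFANA_URL/api/datasources" -d @-
}

# Example (requires a running Grafana):
# add_datasource DS_BLZAEDEMO jmeter
# add_datasource DS_TELEGRAF telegraf
```

Because Grafana is recreated for every test run, scripting this step removes one of the manual actions that otherwise has to be repeated each time.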
Import two dashboard templates: jmeter.json and telegraf.json.
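At this point the metrics collected from the target machine are already visible.
There is no JMeter data yet.
3.3.4 Start the load test
The load test is driven by JMeter, so it needs a JMeter test plan. First create a test plan in the JMeter GUI; here we define a simple one. The plan needs a Backend Listener, configured to send the test results to InfluxDB.
Save the test plan as testplan.jmx and start the load test.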
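A test plan without a working Backend Listener runs fine but leaves the JMeter dashboard empty, so a quick check before deploying can save a wasted run. JMeter's InfluxDB backend listener uses the class org.apache.jmeter.visualizers.backend.influxdb.InfluxdbBackendListenerClient and an influxdbUrl of the form http://host:port/write?db=jmeter. The sketch below only greps the saved plan for those two markers; the function name plan_reports_to_influx is hypothetical.

```shell
#!/bin/sh
# Return 0 if the test plan references JMeter's InfluxDB backend listener
# client and a /write URL for the given database.
# $1 = path to the .jmx file, $2 = InfluxDB database name
plan_reports_to_influx() {
  plan="$1"; db="$2"
  grep -q 'InfluxdbBackendListenerClient' "$plan" &&
    grep -q "/write?db=$db" "$plan"
}

# Example:
# plan_reports_to_influx testplan.jmx jmeter || echo 'Backend Listener missing' >&2
```

This is a textual check only; it does not verify that the host and port in the URL are reachable from inside the cluster.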
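The jmeter directory contains the startup script and configuration; adjust them to match your images.
Start the test with deployment.sh, passing two arguments: the test plan file and the number of instances to start. Here we start two instances: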
k8s@kube-master1:~/dspt/jmeter$ sh deployment.sh ../testplan.jmx 2
$Cleaning up the previous test
Error from server (NotFound): pods "jmetermaster" not found
Error from server (NotFound): configmaps "testplan" not found
Error from server (NotFound): configmaps "jmetermasterconfig" not found
Error from server (NotFound): deployments.apps "jmeter-deployment" not found
rm: cannot remove ‘jmeter_deployment.yml’: No such file or directory
rm: cannot remove ‘jmeter.properties’: No such file or directory
deployment.apps/jmeter-deployment created
Creating 2 jmeter slaves
jmeter-deployment-5c6f57cfdd-c2bcp 1/1 Running 0 10s
jmeter-deployment-5c6f57cfdd-w9gn4 1/1 Running 0 10s
configmap/testplan created
configmap/jmetermasterconfig created
pod/jmetermaster created
Mar 08, 2021 8:17:10 AM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
Creating summariser <summary>
Created the tree successfully using /testplan.jmx
Configuring remote engine: 10.244.1.41
Configuring remote engine: 10.244.2.24
Starting distributed test with remote engines: [10.244.1.41, 10.244.2.24] @ Mon Mar 08 08:17:11 UTC 2021 (1615191431009)
Remote engines have been started:[10.244.1.41, 10.244.2.24]
Waiting for possible Shutdown/StopTestNow/HeapDump/ThreadDump message on port 4445
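deployment.sh itself is not listed here, but its output suggests two steps that are easy to sketch: generating a slave Deployment with the requested replica count, and passing the slave pod IPs to the master as remote_hosts (the property that the master's -r flag reads). The following is a reconstruction under assumptions, not the actual script: the label app=jmeter, the container name and both function names are made up.

```shell
#!/bin/sh
# Print a Deployment manifest for the JMeter slaves.
# $1 = number of replicas, $2 = slave image
render_slave_deployment() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jmeter-deployment
spec:
  replicas: $1
  selector:
    matchLabels:
      app: jmeter
  template:
    metadata:
      labels:
        app: jmeter
    spec:
      containers:
      - name: jmeter
        image: $2
EOF
}

# Turn a whitespace-separated list of pod IPs into the remote_hosts line
# for the master's jmeter.properties.
remote_hosts_line() {
  printf 'remote_hosts=%s\n' "$(echo $1 | tr ' ' ',')"
}

# With a cluster available, the pieces would fit together roughly like this:
# render_slave_deployment 2 192.168.174.50:5000/jmeter:2.0 > jmeter_deployment.yml
# kubectl create -f jmeter_deployment.yml
# ips=$(kubectl get pods -l app=jmeter -o jsonpath='{.items[*].status.podIP}')
# remote_hosts_line "$ips" >> jmeter.properties
# kubectl create configmap testplan --from-file=testplan.jmx
```

Pod IPs are only assigned once the slaves are scheduled, so the IP lookup has to wait until all slave pods report Running.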
The pods in the cluster now look like this:
k8s@kube-master1:~/dspt/jmeter$ kubectl get pods
NAME READY STATUS RESTARTS AGE
grafana-deployment-59fb8479d-26xms 1/1 Running 0 22m
influxdb-deployment-cb9486f67-4srm9 1/1 Running 0 31m
jmeter-deployment-5c6f57cfdd-c2bcp 1/1 Running 0 100s
jmeter-deployment-5c6f57cfdd-w9gn4 1/1 Running 0 100s
jmetermaster 1/1 Running 0 89s
Check Grafana: the load test data is now showing.
3.3.5 Stop the test and destroy the resources
All the resources we are using are temporary:
k8s@kube-master1:~/dspt/jmeter$ kubectl get pods
NAME READY STATUS RESTARTS AGE
grafana-deployment-59fb8479d-26xms 1/1 Running 0 26m
influxdb-deployment-cb9486f67-4srm9 1/1 Running 0 36m
jmeter-deployment-5c6f57cfdd-c2bcp 1/1 Running 0 6m12s
jmeter-deployment-5c6f57cfdd-w9gn4 1/1 Running 0 6m12s
jmetermaster 1/1 Running 2 6m1s
To stop the test, simply destroy the resources.
Run stop.sh in the jmeter directory:
k8s@kube-master1:~/dspt/jmeter$ sh stop.sh
pod "jmetermaster" deleted
configmap "testplan" deleted
configmap "jmetermasterconfig" deleted
deployment.apps "jmeter-deployment" deleted
Run clean.sh in the influxdb directory:
k8s@kube-master1:~/dspt/influxdb$ sh clean.sh
deployment.apps "influxdb-deployment" deleted
service "influxdb-service" deleted
Run clean.sh in the grafana directory:
k8s@kube-master1:~/dspt/grafana$ sh clean.sh
service "grafana-service" deleted
deployment.apps "grafana-deployment" deleted
All resources are now released. Finally, stop Telegraf on the host machine.
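The three cleanup scripts could be folded into a single teardown. The sketch below uses the resource names that appear in the console output above and adds --ignore-not-found so that a repeated run does not report errors; it only prints the commands unless DRY_RUN=0 is set. The function names are hypothetical, and this is not the content of the existing stop.sh or clean.sh scripts.

```shell
#!/bin/sh
# Remove every Kubernetes resource the load test created, in one pass.

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

teardown() {
  run kubectl delete pod jmetermaster --ignore-not-found
  run kubectl delete configmap testplan jmetermasterconfig --ignore-not-found
  run kubectl delete deployment jmeter-deployment influxdb-deployment grafana-deployment --ignore-not-found
  run kubectl delete service influxdb-service grafana-service --ignore-not-found
}

teardown
```

Telegraf runs outside the cluster, so it still has to be stopped separately on the machine where it was started.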
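4. Future work
1. At present everything is started through shell scripts and some steps are not automated enough. The tooling needs to be completed so that the whole process starts with one command and users only have to care about the test plan.
2. Create test plan templates for common scenarios, so that users only need to change parameters.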