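By default, Pinpoint alarm rules for an application are configured through the web UI. The official Pinpoint documentation does not describe an API for this, but the endpoints can be observed by pressing F12 to open the browser's developer tools while using the Pinpoint web UI.
Assume the Pinpoint web address is http://172.31.2.5:8079/ and that it hosts an application named xmgate: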
-
Get the application list:
GET request to http://172.31.2.5:8079/applications.pinpoint
It returns JSON similar to:
[{"applicationName":"xmgate","serviceType":"SPRING_BOOT","code":1210}]
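The same call can be made from Python with only the standard library. This is a minimal sketch, assuming the response has the shape shown above; the names PP_WEB, parse_applications and fetch_applications are illustrative and not part of Pinpoint.

```python
import json
import urllib.request

# Example address from the text above; adjust to your own Pinpoint web instance.
PP_WEB = "http://172.31.2.5:8079"


def parse_applications(body):
    """Return the application names found in an applications.pinpoint response body."""
    return [app["applicationName"] for app in json.loads(body)]


def fetch_applications(base_url=PP_WEB):
    """GET <base_url>/applications.pinpoint and return the application names."""
    with urllib.request.urlopen(base_url + "/applications.pinpoint", timeout=10) as resp:
        return parse_applications(resp.read().decode("utf-8"))
```

Calling fetch_applications() against a reachable Pinpoint web instance would return a list of names; parse_applications can be tried offline on a saved response.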
-
Get the alarm rules of a given application:
GET request to http://172.31.2.5:8079/application/alarmRule.pinpoint?applicationId=xmgate
It returns JSON similar to:
[{"ruleId":"1","applicationId":"xmgate","serviceType":"SPRING_BOOT","checkerName":"HEAP USAGE RATE","threshold":80,"userGroupId":"DevOpsEngineers","smsSend":false,"emailSend":true,"notes":""}]
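As a rough illustration, the query URL can be built with urllib.parse.urlencode so that application names containing special characters are escaped, and the response can be reduced to a mapping from checker name to threshold. The helper names below are hypothetical, and the field names are taken from the sample response above.

```python
import json
from urllib.parse import urlencode


def alarm_rule_url(base_url, application_id):
    """Build the alarmRule.pinpoint query URL for one application."""
    query = urlencode({"applicationId": application_id})
    return base_url + "/application/alarmRule.pinpoint?" + query


def thresholds_by_checker(body):
    """Map checkerName to threshold for every rule in the response body."""
    return {rule["checkerName"]: rule["threshold"] for rule in json.loads(body)}
```

The URL returned by alarm_rule_url can be passed to urllib.request.urlopen, and the decoded body to thresholds_by_checker.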
-
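Set an alarm rule:
POST request to http://172.31.2.5:8079/application/alarmRule.pinpoint with the request header "Content-Type: application/json" and a payload describing the rule to create, similar to the following (the fields may differ between metrics):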
{"applicationId":"xmgate","serviceType":"SPRING_BOOT","checkerName":"HEAP USAGE RATE","userGroupId":"DevOpsEngineers","threshold":80,"emailSend":true,"smsSend":false,"notes":""}
It returns JSON similar to:
{"result":"SUCCESS","ruleId":"35"}
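The POST request described above can be sketched as follows. The function only builds the request object, so it can be inspected before anything is sent; build_alarm_rule_request is an illustrative name, and the payload fields are copied from the sample observed in the browser.

```python
import json
import urllib.request


def build_alarm_rule_request(base_url, application_id, service_type,
                             checker_name, threshold, user_group_id):
    """Build (but do not send) the POST request that creates one alarm rule."""
    payload = {
        "applicationId": application_id,
        "serviceType": service_type,
        "checkerName": checker_name,
        "userGroupId": user_group_id,
        "threshold": threshold,
        "emailSend": True,   # JSON booleans, as in the observed payload
        "smsSend": False,
        "notes": "",
    }
    return urllib.request.Request(
        base_url + "/application/alarmRule.pinpoint",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the response body as JSON.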
Based on the analysis above, the following Python script can be written:
[root@gw5 ~]# cat setAlarm.py
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import sys, json, urllib.request

# Pinpoint web address
ppWeb = 'http://172.31.2.5:8079'
# User group in Pinpoint web that receives the alarms
userGroup = 'DevOpsEngineers'
# Metric names (mtc) and thresholds (tsd) to set alarm rules for; extend as needed
metricList = [{'mtc':'SLOW RATE','tsd':30},
              {'mtc':'ERROR RATE','tsd':30},
              {'mtc':'HEAP USAGE RATE','tsd':80},
              {'mtc':'JVM CPU USAGE RATE','tsd':80},
              {'mtc':'DATASOURCE CONNECTION USAGE RATE','tsd':80},
              {'mtc':'FILE DESCRIPTOR COUNT','tsd':10000}]

# Send a request to Pinpoint: GET when data is empty, POST with a JSON body otherwise
def accessPP(url, header, data):
    if not data:
        request = urllib.request.Request(url)
    else:
        request = urllib.request.Request(url, json.dumps(data).encode("utf-8"))
    for key in header:
        request.add_header(key, header[key])
    try:
        # The with block closes the response even if reading or parsing fails
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    except Exception as e:
        print('[ERROR] %s' % e)
        sys.exit(1)

# Main function
def main():
    # Get the application list
    url = '%s/applications.pinpoint' % ppWeb
    appList = accessPP(url, {}, {})
    if not appList:
        print('[INFO] No applications found in Pinpoint!')
        sys.exit(0)
    for app in appList:
        # Get the alarm rules of this application
        url = '%s/application/alarmRule.pinpoint?applicationId=%s' % (ppWeb, app['applicationName'])
        alarmRuleList = accessPP(url, {}, {})
        # Checker names that already have a rule. Names are compared exactly,
        # so a metric is not mistaken for a longer checker name that contains it.
        existing = {rule['checkerName'] for rule in alarmRuleList
                    if isinstance(rule, dict) and 'checkerName' in rule}
        # Skip rules that already exist, create the ones that do not
        url = '%s/application/alarmRule.pinpoint' % ppWeb
        header = {'Content-Type': 'application/json'}
        for metric in metricList:
            if metric['mtc'] in existing:
                print('[INFO] Application "%s": skipping alarm rule "%s"' % (app['applicationName'], metric['mtc']))
                continue
            data = {
                "applicationId": app['applicationName'],
                "serviceType": app['serviceType'],
                "checkerName": metric['mtc'],
                "userGroupId": userGroup,
                "threshold": metric['tsd'],
                "emailSend": True,   # booleans, matching the payload sent by the web UI
                "smsSend": False,
                "notes": ""
            }
            state = accessPP(url, header, data)
            # Pinpoint does not validate the submitted parameters, so the result is
            # almost always 'SUCCESS'; the check below is of limited value but kept just in case
            if state['result'] == 'SUCCESS':
                print('[INFO] Application "%s": alarm rule set successfully "%s"' % (app['applicationName'], metric['mtc']))
            else:
                print('[ERROR] Application "%s": failed to set alarm rule "%s"' % (app['applicationName'], metric['mtc']))
                print('[INFO] Response: %s' % state)

if __name__ == '__main__':
    main()
Run the script:
[root@gw5 ~]# ./setAlarm.py
The script sets alarm rules for the following metrics (to add or remove metrics or adjust thresholds, edit the metricList variable in the script):
SLOW RATE
ERROR RATE
HEAP USAGE RATE
JVM CPU USAGE RATE
DATASOURCE CONNECTION USAGE RATE
FILE DESCRIPTOR COUNT
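The script skips metrics that already have an alarm rule and adds a rule for those that do not. It can also be scheduled in crontab on a Linux server so that newly added applications get alarm rules automatically.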
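The skip-or-add decision can be isolated in a small pure function, which makes it easy to check without a running Pinpoint instance. This is a sketch under the assumption that existing rules look like the alarmRule.pinpoint response shown earlier; missing_metrics is an illustrative name. Checker names are compared exactly, so a metric such as SLOW RATE is not treated as present merely because a longer checker name contains it.

```python
def missing_metrics(existing_rules, wanted):
    """Return the entries of `wanted` whose 'mtc' has no rule in `existing_rules`.

    existing_rules: list of rule dicts as returned by alarmRule.pinpoint
    wanted:         list of {'mtc': checker name, 'tsd': threshold} dicts
    """
    existing = {rule.get("checkerName") for rule in existing_rules}
    return [metric for metric in wanted if metric["mtc"] not in existing]
```

Because the function returns an empty list once every wanted metric has a rule, repeated runs (for example from crontab) only create rules for applications or metrics that are new since the previous run.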