A First Look at Elastic APM: Distributed Tracing and Real User Monitoring


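The previous article in this series (a full overview of Elasticsearch APM features, part 1) already mentioned distributed tracing and Real User Monitoring (RUM).

  • Distributed tracing: when we monitor the performance of an application endpoint, the call behind that endpoint may be served by several distributed microservices working together. Distributed tracing follows the whole call chain, so developers and operations staff can analyze the performance of each transaction in context and quickly locate the bottleneck that ultimately affects the user experience.
  • Real User Monitoring: captures the user's interaction with clients such as web browsers. Unlike the Elastic APM backend agents, which monitor requests and responses, the RUM JavaScript agent monitors the real user experience and interactions inside the client application.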
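To make the idea of "following the whole call chain" concrete, here is a minimal sketch of how a trace identifier can be carried from one service to the next. It assumes a W3C-style `traceparent` header value (`version-traceid-parentid-flags`); the function names are hypothetical and this is not the Elastic agents' actual implementation, only an illustration of the propagation principle.

```python
import secrets


def new_traceparent():
    """Start a new trace: fresh 16-byte trace id and 8-byte span id."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(incoming):
    """Continue an existing trace: keep the trace id, mint a new span id."""
    version, trace_id, _parent_id, flags = incoming.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def trace_id_of(header):
    """Extract the trace id, which is the part shared by every hop."""
    return header.split("-")[1]


# Simulate three hops: browser -> dev server -> backend.
browser = new_traceparent()
dev_server = child_traceparent(browser)
backend = child_traceparent(dev_server)
```

Every hop gets its own span id, but all three share one trace id. That shared id is what lets a UI stitch the separate services into a single trace.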

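Steps

In this article I try out both features with the following steps and configuration:

  • Edit apm-server.yml to enable RUM (distributed tracing is on by default)
  • Start the APM Server
  • Start the corresponding APM agent in the backend, the front-end server, and the UI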

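I previously wrote a backtesting program for cryptocurrencies, which happens to be a convenient test subject. Its architecture is as follows:

  • The backend is a Flask web application that serves cryptocurrency market data and runs strategy backtests
  • The front end is a Vue interface. It could be bundled with webpack and served by Flask as static and resource files, but in order to test the Node.js agent I run webpack's dev server directly as a separate Node.js web service
  • The UI is written in Vue, so the RUM agent can be started before the Vue modules are loaded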

Starting the APM Server

First, download and unpack the package:

curl -L -O https://artifacts.elastic.co/downloads/apm-server/apm-server-6.5.0-darwin-x86_64.tar.gz
tar xzvf apm-server-6.5.0-darwin-x86_64.tar.gz
cd apm-server-6.5.0-darwin-x86_64/

Next, edit apm-server.yml in that directory to enable RUM:

  rum:
    # To enable real user monitoring (RUM) support set this to true.
    enabled: true

Start the server:

./apm-server -e

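Monitoring the Python backend

First, install the agent with Flask support:

pip install elastic-apm[flask]

The agent is invasive, so the application code has to be modified: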

from elasticapm.contrib.flask import ElasticAPM

# `app` is the existing Flask application object.
# Alternatively, configure ELASTIC_APM in your application's settings.
app.config['ELASTIC_APM'] = {
  'SERVICE_NAME': 'python-backend',
}

apm = ElasticAPM(app)

Note: if SERVICE_NAME is not configured, the application still starts normally, but no data shows up on the APM Server side.
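Since a missing SERVICE_NAME fails silently, one option is to build the configuration in a small helper that refuses to start without it. The sketch below is an assumption-laden example: the environment variable names `APM_SERVICE_NAME` and `APM_SERVER_URL` and the helper `build_apm_config` are hypothetical, while `SERVICE_NAME` and `SERVER_URL` are agent settings and `http://localhost:8200` is the APM Server's default address.

```python
import os


def build_apm_config(env=None):
    """Build the ELASTIC_APM dict, failing loudly if no service name is set.

    The environment variable names used here are hypothetical.
    """
    env = os.environ if env is None else env
    service_name = env.get("APM_SERVICE_NAME", "").strip()
    if not service_name:
        raise ValueError("APM_SERVICE_NAME must be set, otherwise no data appears")
    return {
        "SERVICE_NAME": service_name,
        "SERVER_URL": env.get("APM_SERVER_URL", "http://localhost:8200"),
    }


# Example with an explicit mapping instead of the real environment.
config = build_apm_config({"APM_SERVICE_NAME": "python-backend"})

# In the Flask application this would then be used as:
#   app.config['ELASTIC_APM'] = config
#   apm = ElasticAPM(app)
```

Failing at startup is a deliberate choice here: it turns an empty dashboard into an explicit error message.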

After startup you will see output like the following. The many "Instrumented" lines indicate which libraries the agent has started to monitor:

2019-01-28 17:16:27,267 _internal.py[line:88] INFO  * Running on http://0.0.0.0:80/ (Press CTRL+C to quit)
2019-01-28 17:16:27,268 _internal.py[line:88] INFO  * Restarting with stat
2019-01-28 17:16:28,522 base.py[line:83] DEBUG Instrumented django_template, django.template.Template.render
2019-01-28 17:16:28,523 base.py[line:74] DEBUG Skipping instrumentation of psycopg2-register-type. Module psycopg2.extensions not found
2019-01-28 17:16:28,523 base.py[line:74] DEBUG Skipping instrumentation of psycopg2-register-type. Module psycopg2._json not found
2019-01-28 17:16:28,524 base.py[line:74] DEBUG Skipping instrumentation of cassandra. Module cassandra.cluster not found
2019-01-28 17:16:28,524 base.py[line:74] DEBUG Skipping instrumentation of botocore. Module botocore.client not found
2019-01-28 17:16:28,525 base.py[line:74] DEBUG Skipping instrumentation of mysql. Module MySQLdb not found
2019-01-28 17:16:28,525 base.py[line:83] DEBUG Instrumented requests, requests.sessions.Session.send
2019-01-28 17:16:28,528 base.py[line:74] DEBUG Skipping instrumentation of sqlite. Module pysqlite2.dbapi2 not found
2019-01-28 17:16:28,528 base.py[line:83] DEBUG Instrumented sqlite, sqlite3.connect, sqlite3.dbapi2.connect
2019-01-28 17:16:28,564 base.py[line:83] DEBUG Instrumented pymongo, pymongo.bulk.BulkOperationBuilder.execute
2019-01-28 17:16:28,571 base.py[line:83] DEBUG Instrumented redis, redis.client.BasePipeline.execute
2019-01-28 17:16:28,571 base.py[line:74] DEBUG Skipping instrumentation of pylibmc. Module pylibmc not found
2019-01-28 17:16:28,572 base.py[line:83] DEBUG Instrumented jinja2, jinja2.Template.render
2019-01-28 17:16:28,572 base.py[line:74] DEBUG Skipping instrumentation of python_memcached. Module memcache not found
2019-01-28 17:16:28,573 base.py[line:74] DEBUG Skipping instrumentation of pyodbc. Module pyodbc not found
2019-01-28 17:16:28,573 base.py[line:83] DEBUG Instrumented pymongo, pymongo.collection.Collection.aggregate, pymongo.collection.Collection.bulk_write, pymongo.collection.Collection.count, pymongo.collection.Collection.create_index, pymongo.collection.Collection.create_indexes, pymongo.collection.Collection.delete_many, pymongo.collection.Collection.delete_one, pymongo.collection.Collection.distinct, pymongo.collection.Collection.drop, pymongo.collection.Collection.drop_index, pymongo.collection.Collection.drop_indexes, pymongo.collection.Collection.ensure_index, pymongo.collection.Collection.find_and_modify, pymongo.collection.Collection.find_one, pymongo.collection.Collection.find_one_and_delete, pymongo.collection.Collection.find_one_and_replace, pymongo.collection.Collection.find_one_and_update, pymongo.collection.Collection.group, pymongo.collection.Collection.inline_map_reduce, pymongo.collection.Collection.insert, pymongo.collection.Collection.insert_many, pymongo.collection.Collection.insert_one, pymongo.collection.Collection.map_reduce, pymongo.collection.Collection.reindex, pymongo.collection.Collection.remove, pymongo.collection.Collection.rename, pymongo.collection.Collection.replace_one, pymongo.collection.Collection.save, pymongo.collection.Collection.update, pymongo.collection.Collection.update_many, pymongo.collection.Collection.update_one
2019-01-28 17:16:28,574 base.py[line:83] DEBUG Instrumented redis, redis.client.Redis.execute_command, redis.client.StrictRedis.execute_command
2019-01-28 17:16:28,574 base.py[line:83] DEBUG Instrumented elasticsearch, elasticsearch.client.Elasticsearch.delete_by_query, elasticsearch.client.Elasticsearch.search, elasticsearch.client.Elasticsearch.count, elasticsearch.client.Elasticsearch.update
2019-01-28 17:16:28,575 base.py[line:83] DEBUG Instrumented elasticsearch_connection, elasticsearch.connection.http_urllib3.Urllib3HttpConnection.perform_request, elasticsearch.connection.http_requests.RequestsHttpConnection.perform_request
2019-01-28 17:16:28,575 base.py[line:74] DEBUG Skipping instrumentation of pymssql. Module pymssql not found
2019-01-28 17:16:28,575 base.py[line:74] DEBUG Skipping instrumentation of psycopg2. Module psycopg2 not found
2019-01-28 17:16:28,576 base.py[line:83] DEBUG Instrumented django_template_source, django.template.base.Parser.extend_nodelist
2019-01-28 17:16:28,576 base.py[line:83] DEBUG Instrumented pymongo, pymongo.cursor.Cursor._refresh
2019-01-28 17:16:28,576 base.py[line:83] DEBUG Instrumented urllib3, urllib3.connectionpool.HTTPConnectionPool.urlopen, requests.packages.urllib3.connectionpool.HTTPConnectionPool.urlopen
2019-01-28 17:16:28,612 base.py[line:159] INFO Scheduler started
2019-01-28 17:16:28,612 base.py[line:926] DEBUG Looking for jobs to run
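The startup log is long, so a short script can help summarize which libraries the agent hooked and which it skipped. This is a rough sketch that assumes the exact message wording shown above ("Instrumented NAME, ..." and "Skipping instrumentation of NAME. ..."); the function name `summarize` is hypothetical, and the sample lines are copied from the log.

```python
import re

# Patterns match the message wording seen in the log output above.
INSTRUMENTED = re.compile(r"DEBUG Instrumented ([\w-]+),")
SKIPPED = re.compile(r"DEBUG Skipping instrumentation of ([\w-]+)\.")


def summarize(lines):
    """Return (instrumented, skipped) instrumentation names, each sorted."""
    instrumented, skipped = set(), set()
    for line in lines:
        match = INSTRUMENTED.search(line)
        if match:
            instrumented.add(match.group(1))
            continue
        match = SKIPPED.search(line)
        if match:
            skipped.add(match.group(1))
    # A name can be skipped for one module and instrumented via another
    # (sqlite above), so only report names that were never instrumented.
    return sorted(instrumented), sorted(skipped - instrumented)


sample = [
    "2019-01-28 17:16:28,525 base.py[line:74] DEBUG Skipping instrumentation of mysql. Module MySQLdb not found",
    "2019-01-28 17:16:28,525 base.py[line:83] DEBUG Instrumented requests, requests.sessions.Session.send",
    "2019-01-28 17:16:28,528 base.py[line:74] DEBUG Skipping instrumentation of sqlite. Module pysqlite2.dbapi2 not found",
    "2019-01-28 17:16:28,528 base.py[line:83] DEBUG Instrumented sqlite, sqlite3.connect, sqlite3.dbapi2.connect",
]
instrumented, skipped = summarize(sample)
print("instrumented:", instrumented)
print("skipped:", skipped)
```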

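Monitoring the Node.js backend

In this example the Node.js web server only serves resource files (static images, HTML, JS and so on). I am using Vue here, and the Vue scaffolding sets up webpack for us. Running npm run dev starts a web server, and the agent has to be injected into that web server's startup code.

First, add the Node.js agent to the dependencies:

npm install elastic-apm-node --save

Then edit webpack-dev-server.js and start the agent at the very top of the file, before any other module is required: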

var apm = require('elastic-apm-node').start({
  serviceName: 'web-dev-server',
})


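Monitoring user behavior

This agent has to be embedded in the user-facing UI.
First, install the agent:

npm install elastic-apm-js-base --save

Then add the code that starts the RUM agent during the UI's startup phase: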

import { init as initApm } from 'elastic-apm-js-base'

var apm = initApm({
  // Set required service name (allowed characters: a-z, A-Z, 0-9, -, _, and space)
  serviceName: 'front-end RUM'
})

Since I am using Vue, this code goes in main.js.
Now the front end can be started:

npm run dev

APM UI

At this point all three agents show up in Kibana's APM UI.
In theory, refreshing the page or clicking on it should trigger an APM record. That record is a root node that spans from the page refresh, through requesting and parsing the resource files, to fetching data from the backend. The full distributed tracing path should be: front-end RUM -> web-dev-server -> python-backend.

In practice, it clearly is still a beta: in the trace view, the services are disconnected from one another.
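One way to express what "disconnected" means is that no single trace id is shared by all three services. The sketch below checks this on a simplified list of transaction records. The flat `trace_id` and `service` keys are a hypothetical simplification of the stored APM documents, and the sample records are made-up illustrative values, not data from this test.

```python
from collections import defaultdict

EXPECTED = {"front-end RUM", "web-dev-server", "python-backend"}


def services_per_trace(transactions):
    """Group service names by trace id."""
    grouped = defaultdict(set)
    for tx in transactions:
        grouped[tx["trace_id"]].add(tx["service"])
    return dict(grouped)


def is_connected(transactions, expected_services):
    """True if at least one trace contains every expected service."""
    grouped = services_per_trace(transactions)
    return any(expected_services <= services for services in grouped.values())


# Illustrative records only: each service reports under its own trace id.
split = [
    {"trace_id": "a1", "service": "front-end RUM"},
    {"trace_id": "b2", "service": "web-dev-server"},
    {"trace_id": "c3", "service": "python-backend"},
]
# What a fully linked trace would look like: one shared trace id.
linked = [dict(tx, trace_id="a1") for tx in split]

print(is_connected(split, EXPECTED), is_connected(linked, EXPECTED))
```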

Summary

Judging from this simple test, distributed tracing still has problems: it cannot yet link all of the steps together.
As for RUM, transactions currently cover only the page-load type; user clicks and selections are not recorded.
The page-load timeline also strikes me as of limited use, since it offers no more meaningful information than Chrome's timeline, and in practice we cannot really deploy it on real users' clients.

Elastic APM therefore still needs to round out both features before they deliver real value.
