Filtering requests in AWS X-Ray with sampling rules

References

  • https://github.com/aws/aws-xray-sdk-python
  • Python API reference: https://docs.aws.amazon.com/xray-sdk-for-python/latest/reference/
  • Node.js API reference: https://docs.aws.amazon.com/xray-sdk-for-nodejs/latest/reference/

Set up the environment

npm init -y
npm install aws-xray-sdk
npm install aws-sdk
npm install express

We test the configuration with a Node.js server built on Express. The code is as follows:

// app.js
var AWSXRay = require('aws-xray-sdk');
// Wrap the AWS SDK with X-Ray so that AWS calls are traced
var AWS = AWSXRay.captureAWS(require('aws-sdk'));
// var AWS = require('aws-sdk');

// AWSXRay.config([AWSXRay.plugins.EC2Plugin, AWSXRay.plugins.ElasticBeanstalkPlugin]);
AWS.config.update({ region: 'cn-north-1' });

// Address the X-Ray daemon listens on
AWSXRay.setDaemonAddress('127.0.0.1:2000');

const express = require('express')
const app = express()
const port = 3000

app.use(AWSXRay.express.openSegment('TestPathApp'));

app.get('/', (req, res) => {
    var document = AWSXRay.getSegment();
    // Add an annotation and metadata to the segment
    document.addAnnotation("mykey", "my value");
    document.addMetadata("my key", "my value", "my namespace");
    res.send('Hello World!')
    var s3 = new AWS.S3();
    var params = {};
    s3.listBuckets(params, function (err, data) {
        if (err) console.log(err, err.stack);
        else console.log(data);
    });
})

app.get('/testpath', (req, res) => {
    res.send('test path!')
})

app.use(AWSXRay.express.closeSegment());

app.listen(port, () => {
    console.log(`Example app listening on port ${port}`)
})

Run the server

node app.js

Reference documentation for configuring sampling rules

https://docs.aws.amazon.com/zh_cn/xray/latest/devguide/xray-console-sampling.html

Configuring the X-Ray SDK for Node.js

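Configuring sampling rules

For applications running on EC2, the sampling rules configured in X-Ray control the recording behavior of servers that use the X-Ray SDK.

The point of setting sampling rules on the service side is that you can change sampling behavior, and so control the amount of data recorded, without modifying or redeploying code.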

The default sampling rule

  • Records the first request each second
  • Records 5% of any additional requests
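
To make the default rule concrete, here is a minimal sketch of a "first request each second, then a fixed percentage" sampler. It is an illustration only, not the SDK's actual implementation; the class name SimpleSampler and the injectable `now` and `random` parameters are assumptions added so the behavior can be checked deterministically.

```javascript
// Minimal sketch of "reservoir + fixed rate" sampling; not the X-Ray SDK's code.
// SimpleSampler, `now` (whole seconds) and `random` are hypothetical names.
class SimpleSampler {
  constructor(fixedTarget, rate, random = Math.random) {
    this.fixedTarget = fixedTarget; // requests always sampled in each second
    this.rate = rate;               // fraction of the remaining requests to sample
    this.random = random;
    this.currentSecond = null;
    this.usedThisSecond = 0;
  }

  shouldSample(now = Math.floor(Date.now() / 1000)) {
    // Reset the reservoir whenever a new one-second window starts.
    if (now !== this.currentSecond) {
      this.currentSecond = now;
      this.usedThisSecond = 0;
    }
    if (this.usedThisSecond < this.fixedTarget) {
      this.usedThisSecond += 1;
      return true;
    }
    // Reservoir exhausted: fall back to the fixed rate.
    return this.random() < this.rate;
  }
}

// Default rule: the first request each second, then 5% of the rest.
const sampler = new SimpleSampler(1, 0.05);
console.log(sampler.shouldSample()); // the first request in a second is always sampled
```

Later requests in the same second are sampled only when the random draw falls below the rate, which is why most of them are dropped.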

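There are two ways to configure sampling rules:

  • Configure the SDK to read sampling rules from a JSON file. The drawbacks are that each instance applies its own copy of the rules, which can raise the number of sampled requests, and that updating the rules means restarting the application.

    The weakness of locally defined rules is that the fixed target is applied independently by each instance of the recorder instead of being managed by the X-Ray service. As you deploy more hosts, the fixed target is multiplied by the number of hosts, which makes it harder to control the amount of data recorded.

  • Create sampling rules in the console and configure the SDK to read the rules defined in the X-Ray service. No redeployment is needed, and the number of sampled requests does not grow with the number of instances.

    The X-Ray SDK needs additional configuration to use the sampling rules you configure in the console.

    If the SDK cannot reach X-Ray to get sampling rules, it falls back to the default local rule: the first request each second plus 5% of any additional requests (fixed target 1, rate 5%).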
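
The effect of per-host rules can be seen with a little arithmetic. The sketch below compares the traces per second for a fleet where every host applies the local rule on its own against an idealized case where one reservoir is shared by the whole fleet. The function names, the host counts, and the 20 requests per second per host are all hypothetical, and the shared-reservoir model is a simplifying assumption about centralized rules, not a description of how X-Ray distributes quota.

```javascript
// Expected traces per second for one sampler, given its request volume.
function sampledPerSecond(requests, fixedTarget, rate) {
  const fromTarget = Math.min(requests, fixedTarget);
  return fromTarget + (requests - fromTarget) * rate;
}

// Local rules: every host applies fixed_target independently.
function fleetTotalLocal(hostCount, requestsPerHost, fixedTarget, rate) {
  return hostCount * sampledPerSecond(requestsPerHost, fixedTarget, rate);
}

// Idealized centralized rule: a single reservoir shared by the whole fleet (assumption).
function fleetTotalShared(hostCount, requestsPerHost, reservoir, rate) {
  return sampledPerSecond(hostCount * requestsPerHost, reservoir, rate);
}

// Hypothetical fleet sizes, each host serving 20 requests per second.
for (const hosts of [1, 10, 100]) {
  console.log(
    `hosts=${hosts}`,
    `local=${fleetTotalLocal(hosts, 20, 1, 0.05).toFixed(2)}`,
    `shared=${fleetTotalShared(hosts, 20, 1, 0.05).toFixed(2)}`
  );
}
```

With local rules the guaranteed part of the total grows linearly with the host count, while in the shared model it stays at the reservoir size.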

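Contents of a sampling rule

The default rule is as follows:

  • A reservoir of 1 request per second and a fixed rate of 5%
  • Every matching criterion is the wildcard *, so it matches all requests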


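Set the rule name and priority

Set the sampling limits

  • Reservoir: the number of matching requests sampled each second before the fixed rate applies
  • Fixed rate: the percentage of additional requests sampled once the reservoir is exhausted
  • Example: with a reservoir of 50 and a fixed rate of 10%, if 100 requests arrive per second, the number sampled per second is 50 + (100 - 50) * 10% = 55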
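
The worked example above can be written as a small helper, which is handy for trying other reservoir and rate values before creating a rule. The function name expectedSampledPerSecond is made up for this sketch, and the result is an expected value, since the fixed rate is applied probabilistically.

```javascript
// Expected number of sampled requests per second for one rule.
// reservoir: requests always sampled per second; fixedRate: fraction of the rest.
function expectedSampledPerSecond(totalRequests, reservoir, fixedRate) {
  const fromReservoir = Math.min(totalRequests, reservoir);
  const fromRate = (totalRequests - fromReservoir) * fixedRate;
  return fromReservoir + fromRate;
}

// The example from the text: 50 + (100 - 50) * 10%
console.log(expectedSampledPerSecond(100, 50, 0.1));
```

When fewer requests arrive than the reservoir holds, every request is sampled and the fixed rate never comes into play.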

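Set the matching criteria

Choose service-oriented criteria to determine which requests the rule matches. Values can include the multi-character wildcard (*) or the single-character wildcard (?).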
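
To show what the two wildcards mean, here is a small sketch that converts such a pattern into a regular expression. It only illustrates the * and ? semantics described above; it is not the matcher used by X-Ray or the SDK, and the helper names wildcardToRegExp and wildcardMatch are hypothetical.

```javascript
// Convert a pattern with * (any run of characters, including none) and
// ? (exactly one character) into an anchored regular expression.
function wildcardToRegExp(pattern) {
  // Escape regex metacharacters other than * and ?.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  const source = escaped.replace(/\*/g, '.*').replace(/\?/g, '.');
  return new RegExp(`^${source}$`);
}

function wildcardMatch(pattern, value) {
  return wildcardToRegExp(pattern).test(value);
}

console.log(wildcardMatch('/api/move/*', '/api/move/up')); // * covers "up"
console.log(wildcardMatch('/api/move/*', '/testpath'));    // prefix differs, no match
console.log(wildcardMatch('Test?athApp', 'TestPathApp'));  // ? covers the single "P"
```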


After the server handles a request, the raw data it sends to X-Ray (with the EC2 plugin disabled) looks like this:

{
    "Id": "1-645b3731-14c36c71cafd66c95e9ce301",
    "Duration": 0.001,
    "LimitExceeded": false,
    "Segments": [
        {
            "Id": "36cdc1612a565cbf",
            "Document": {
                "id": "36cdc1612a565cbf",
                "name": "TestPathApp",
                "start_time": 1683699504.526,
                "trace_id": "1-645b3731-14c36c71cafd66c95e9ce301",
                "end_time": 1683699504.527,
                "http": {
                    "request": {
                        "url": "http://127.0.0.1:3000/testpath",
                        "method": "GET",
                        "user_agent": "curl/7.88.1",
                        "client_ip": "::ffff:127.0.0.1"
                    },
                    "response": {
                        "status": 200
                    }
                },
                "aws": {
                    "xray": {
                        "package": "aws-xray-sdk",
                        "rule_name": "Default",
                        "sdk_version": "3.5.0",
                        "sdk": "X-Ray for Node.js"
                    }
                },
                "service": {
                    "name": "unknown",
                    "version": "unknown",
                    "runtime": "node",
                    "runtime_version": "v16.15.0"
                }
            }
        }
    ]
}

Configuring sampling rules in the console

The TestPathApp in the code is the service name:

app.use(AWSXRay.express.openSegment('TestPathApp'));
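
As an alternative to filling in the console form, a rule for this service could be prepared as JSON and created with `aws xray create-sampling-rule --cli-input-json file://rule.json`. The sketch below only builds that JSON. The helper name buildSamplingRule, the rule name, the priority, and the limits are hypothetical values chosen for illustration, not settings taken from this article.

```javascript
// Build the input document for `aws xray create-sampling-rule`.
// buildSamplingRule and all argument values below are illustrative assumptions.
function buildSamplingRule({ ruleName, serviceName, urlPath = '*', priority, reservoirSize, fixedRate }) {
  return {
    SamplingRule: {
      RuleName: ruleName,
      ResourceARN: '*',
      Priority: priority,           // lower numbers are evaluated first
      FixedRate: fixedRate,         // fraction sampled after the reservoir is used up
      ReservoirSize: reservoirSize, // requests sampled per second before the rate applies
      ServiceName: serviceName,     // the name passed to openSegment()
      ServiceType: '*',
      Host: '*',
      HTTPMethod: '*',
      URLPath: urlPath,
      Version: 1,
    },
  };
}

const rule = buildSamplingRule({
  ruleName: 'testpathapp-testpath', // hypothetical rule name
  serviceName: 'TestPathApp',
  urlPath: '/testpath',
  priority: 100,                    // hypothetical priority
  reservoirSize: 1,
  fixedRate: 0.05,
});
console.log(JSON.stringify(rule, null, 2));
```

Saving the printed JSON to a file gives something that can be passed to the CLI command, assuming the credentials in use are allowed to create sampling rules.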

Configuring sampling rules in the SDK

Create the rules JSON file:

cat sampling-rules.json
{
  "version": 2,
  "rules": [
    {
      "description": "Player moves.",
      "host": "*",
      "http_method": "*",
      "url_path": "/api/move/*",
      "fixed_target": 1,
      "rate": 0.05
    }
  ],
  "default": {
    "fixed_target": 1,
    "rate": 0.1
  }
}

Then reference the rules in the SDK, either by file path or as an object defined in code:

// AWSXRay.middleware.setSamplingRules('sampling-rules.json');
AWSXRay.setDaemonAddress('127.0.0.1:2000');
var rules = {
    "rules": [
        {
            "description": "Player moves.",
            "service_name": "*",
            "http_method": "*",
            "url_path": "/api/move/*",
            "fixed_target": 1,
            "rate": 0.05
        }
    ],
    "default": {
        "fixed_target": 0,
        "rate": 0.1
    },
    "version": 1
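};
// Apply the in-code rules (a file path can be passed instead, as in the commented-out line above)
AWSXRay.middleware.setSamplingRules(rules);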

Restart the server and watch the logs. The custom logger below prints the X-Ray SDK's log messages to the console:

var logger = {
    error: (message, meta) => { console.log(message, meta); },
    warn: (message, meta) => { console.log(message, meta); },
    info: (message, meta) => { console.log(message, meta); },
    debug: (message, meta) => { console.log(message, meta); }
}

AWSXRay.setLogger(logger);

When /testpath is requested, the log is as follows:

  • No centralized sampling rule matched, so the local sampling rules were used
  • The segment's data is logged as it is sent
No effective centralized sampling rule match. Fallback to local rules. undefined
Local sampling rule match found for { http_method: GET, host: 127.0.0.1:3000, url_path: /testpath }. Matched { http_method: *, host: *, url_path: /testpath }. Using fixed_target: 1 and rate: 0.05. undefined

Starting middleware segment: { url: /testpath, name: TestPathApp, trace_id: 1-645b4739-390fcc9ee953490557b0b920, id: dac471632ef967d6, sampled: true } undefined
Closed middleware segment successfully: { url: /testpath, name: TestPathApp, trace_id: 1-645b4739-390fcc9ee953490557b0b920, id: dac471632ef967d6, sampled: true } undefined

Segment sent: {"trace_id:"1-645b4739-390fcc9ee953490557b0b920","id":"dac471632ef967d6"} undefined
UDP message sent: {"trace_id":"1-645b4739-390fcc9ee953490557b0b920","id":"dac471632ef967d6","start_time":1683703609.304,"name":"TestPathApp","service":{"runtime":"node","runtime_version":"v16.15.0","version":"unknown","name":"unknown"},"aws":{"xray":{"sdk":"X-Ray for Node.js","sdk_version":"3.5.0","package":"aws-xray-sdk"}},"http":{"request":{"method":"GET","user_agent":"curl/7.88.1","client_ip":"::ffff:127.0.0.1","url":"http://127.0.0.1:3000/testpath"},"response":{"status":200}},"end_time":1683703609.328} undefined
Successfully refreshed centralized sampling rule cache. undefined

When / is requested, however, the log is as follows:

Rule Default is matched. undefined

Starting middleware segment: { url: /, name: TestPathApp, trace_id: 1-645b477b-ba767934aabd33cbda7c5455, id: a54f43f3efcd4f89, sampled: false } undefined
Closed middleware segment successfully: { url: /, name: TestPathApp, trace_id: 1-645b477b-ba767934aabd33cbda7c5455, id: a54f43f3efcd4f89, sampled: false } undefined

Successfully reported rule statistics to get new sampling quota. undefined

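As the logs show, the sampling rules let us control how much data is collected and avoid unnecessary overhead (for example, from load balancer health check requests).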
