Testing & Paper reading——Sketchvisor

Paper——SIGCOMM 17 : Reading SketchVisor Robust Network Measurement for Sofeware Packet Processing

A Partial User-space Implementation of SketchVisor's Logoc using Python and Mininet.

 

一、Introduction

SketchVisor is an paper presented in SIGCOMM17 and suggests a method to perform common network measurement task such as Heavy-Hitter-Detection, Heavy-Changers, DDos ,Cardinality基数, Flow-Size-Distribution流量大小分布, etc. While minding Performance, Resource efficiency, Accuracy(不基于采样)and Generality(支持各种基于sketch的解决方案)。

SketchVisor是一种基于纯软件包转发的网络测量框架,并改进现有算法提出了两种算法。此网络测量框架具有高性能(line-rate)、高精度、广泛性(适用于多种sketch算法)、自动化(自动掉结负载)的特点。框架包括数据平面和控制平面,每个软件交换机对应一个数据平面,每个数据平面包括normal path和fast path。一旦流量过载,SketchVisor就将过载的额流量重定向到Fast path,以保持高性能和高精度(虽然有轻微损失),其中在Fast path 设计了Top-k算法。有一个总的控制平面,设计Compressing Sensing算法,对分布的交换机提供的数据加以整合,恢复出全网的数据。最后实验证明,SktechVisor使一系列基于sketch的测量方法达到流高性能和高精度。

二、Background

Terminopogy

  • Epoches: one or multiple time periods;
  • Traffic statistic(流量统计): can be either flow-based(identified by 5-tuples) or host-based(identified by IP addresses) or either volume-based基于卷(measured by byte counts 通过字节数衡量) or connectivity-based基于连接(measured by distinct flow/host counts通过不同的流/主机数衡量);
  • Sketch: At a hogh level, a sketch is a compact data structure comprising a set of buckets, each of which is associated with one or multiple counters.It maps each pakect to a subset of buclets with independent hash functions, and updates the counters of those buckets.Network operators can query the counter values to recover traffic statistics.

Measurement Tasks

  • Network measurement tasks includes: monitor traffic and collect traffic statistic and some network attack;
  • Attack: heavy hitter, heavy changer, DDos , Superspreader, Cardinality, Flow size distribution, ENtropy熵.

Performance Flaws 性能缺陷

  • Sketches are only primitives that cannot be directily used for network measurement;
  • In order to collect meaningful traffic statistics,we must add extension to sketches to make them reversible,meaning that sketches not only store statistics,but also efficientlty answer queries on the statistics;
  • Although sketches are efficiently designed,applying them in network mteasurement inveitably incurs必然导致 heavy computational overhead大量的计算开销;
  • Sketches are compact data structures that can summarize traffic statistics of all packets with fixed-size memory固定大小内存, while incurring only bounded errors有限的错误.

Microbenchmark 微基准

  • Providing comparation of exsited method:

Testing & Paper reading——Sketchvisor

三、Problem

  • Existing sketch-based measurement solutions suffer from severe performance drops under high traffic load在高流量的负载下性能严重下降。
  • Heavy computational overhead:existing representative sketch-based solutions in software actually consume substantial CPU resources additional extensions or components that oftem incur heavy computations.
  • Optimizing soecific functions优化特定功能(比如,使用基于硬件的哈希算法)may not work well for akk sketch-based solutions.

四、Goals

Our Goal

Being able to track as many flows as possible simultaneously, to perform the above tasks.

Testing & Paper reading——Sketchvisor

 

 

Design Goals

  • Perfomance: It processes packets at high speed and aims to fulfill the line-rate requirement of the underlying packets processing pipeline满足基础数据包处理管道的线速要求.
  • Resource efficiency: It efficiently utilizes CPU for packet processing and memory for data structure有效利用CPU进行数据包处理并利用内存进行数据结构。
  • Accuracy: 保留了sketch的高测量精度。
  • Generality: 支持广泛的基于sketch测量的任务。
  • Simplicity: : It automatically mitigates the processing burdens of sketch-based measurement tasks under high traffic load自动减轻流在高速流量负载下基于sketch的测量任务的处理了负担, without requiring manual per-host configurations and result aggregations by network operators结果由网络运营商汇总.

五、How does it work?

SketchVisor utilizes principles laid by Misra-Gires Algorithm by introducing a "Fast Path " which tracks major flows, while being able to restore恢复 the full sketch at the end of the process.

The following diagram shows the main architecture of SketchVisor.Note that figure(a) is being run on each host, while figure(b) is the single control-plane.

Testing & Paper reading——Sketchvisor

 (b) SketchVisor control plane

 

Date Plane

  • Each host possess a data plane, data plane can choose monitor ingress or egress traffic in case duplicated count.
  • Data plane has two path, one is Normal path and another is  Fast path, when buffer is full,the SketchVisor intructs the software swicth to redirect overflowed packets to the fast path重新指引溢出的数据包到快速路径.
  • They don't consider any proactive approach主动的方法 that examines packets and decide which packets should ba dispatched into either the normal path or the fast path, as it will incur non-trival overhead会产生非常规的开销.
  • The Fast path is less accurate than the Normal path.
  • The Fast path should satisfy: fast enough to absorb all redirected traffic足够快速的吸收所有重定向的流量; Highly accurate although slightly degrade from original sketch-based measurement 尽管从原始的基于sketch的测量精度略有降低,但是仍具有很高的准确性; General for various statics because each statics paobably redirect into Fast path对各种静态值而言通常如此,因为每个静态值都可能重定向到快速路径中.

Control Plane

  • The Control plane collects each switch's results and merges them to provide network-wide measurement.
  • The Control plane shold satisgy: eliminate the extra errors due to fast fast path消除由于快速路径而产生的额外错误(the error should onle come from sketches themselves); must be general to accommadate various measurement tasks必须通用以适应各种测量任务.

SketchVisor

  • Two algorithmic solutions, one builts on counter-based algorithms while the second builds on compressive sensing to design a network-wide recovery algorithm第二种是基于压缩感测来设计网络范围的恢复算法.

Fast Path

  • To avoid the measurement failed and keep accuracy, Sketchvisor redirects overflow traffic into Fast Path.
  • Design top-k algorithm which builds on Misra-Gries’s top-k algorithm for fast path.
  • First, in order to kick out a small flow and add a(potentially) large flow,it performs执行 O1ko operations to update k counters in a hash table; the overhead becomes significant when there are many small flows to kick out 当有许多小流要被踢出去时,开销将会变得很可观.
  • Second, it has loose bounds on the estimated values of the top-k flows对前K个流量的估计值具有宽松的界限. To overcome both limitations, we combine the idea of probabilistic lossy counting (PLC) 概率损失计数的概念, a probabilistic algorithm that improves accuracy for tracking skewed data一种概率性算法,可以提高跟踪偏斜数据的准确性 ,with Misra-Gries’s algorithm.
  • Specifically, we kick out multiple small flows each time, obviating the need of performing O1ko counter update operations for kicking out each flow 消除踢出每个流时执行O1ko计数器更新操作的需求 (i.e., we amortize the operations over multiple kick-outs).
  • Also, instead of using one counter per flow, we carefully associate three counters with each flow to provide tight per-flow lower and upper bounds 我们仔细地将三个计数器与每个流相关联,以提供严格的每流上下限.

Compressive Sensing 压缩感测

  • Use Compressive sensing to recover network-wide statistics网络范围的统计信息.

Explanation

Once the regular(Normal-path) sketch is unable to keep up with the rate of incoming packets, they're being sent to the Fast Path.The Fast Path is the "twist" upon Misra-Gries algorithm, the key differences are:

  1. Instead of <key,count> Pair foe each flow, we store <key, byte-count>*;
  2. Instead of increasing the counter by 1 when a new item arrives, we increase by it's byte-count.

*Note: we actually store 3 counters: <key, e_f, r_f, d_f>    (e_f: the maximum possible byte count that can be missed before f is inserted 在插入f之前可能会丢失的最大字节数.   r_f: the residual byte count of f.  d_f: the decremented byte count after f is inserted 插入f之后减少的字节数)

BUT...what happens when the table is full, and a new item arrive?

  1. In Misra-Gries Algorithm we subtract all counters by 1.
  2. In Fast Path we subtract by some e threshold 我们减去一些e阈值.

Consider all existing flows in the table, |k|, SORTED by size, with the addition of the new flow, so that's |k+1| items: Desired e needs to be at least larger than the smallest flow in the table, in order to be able to take out any existing item. But, e shouldn't be too big to avoid taking out the rest of the items. We could've set e=a[k+1] and simply take out the smallest item, instead, the paper suggests a method allowing us to take out more than a single flow at a time 该文章介绍流一种方法允许我们一次能取出多个流, calculating a "smart" value e that ensures probability of some small delta=0.05 to take out an item bigger than the minimal one 计算出一个“智能”值e以确保某个小的delta=0.05,可以取出比最小值更大的项。

Addressing Data-Loss on Fast-Path 解决快速路径上的数据丢失

By tracking the larger, dominating flows on the Fast-Path, we allegedly lose information about smaller flows which are necessary for some tasks丢失了一些任务所必须的关于较小流量的信息, for example, DDoS-Detection.

The paper suggests a way of reconstructing the "original" sketch as if all flows we're being tracked, by formulating a convex optimization problem通过构造一个凸优化问题. This project does not implement the convex-optimization and therefore unable to accuratly answer tasks that are small-flow dependant依赖小流量(like DDoS detection).

六、Implementation

For my implementation, a Mininet VM is used. Like the experiment in the paper, it consists of a single switch topology with 9 hosts. The implementation is done in Python and relies on Mininet's Python API, and Consists of Modules: PacketCapture Module, NormalPath, Buffer, FastPath, ControlPlane. Like the paper suggests, inter-process communication is done via ZeroMQ. In order to run, execute Start.py. The following steps will occur:

  1. Initialize Network topology, setup virtual hosts and switch.
  2. Configure switch.
  3. Deploy NormalPath.py, Buffer.py, FastPath.py Modules on each host.
  4. Deploy ControlPlane.py on the Switch, and launch xterm for the user to watch the results LIVE启动xterm供用户实时观看结果.
  5. Start injecting dummy traffic in the network, generate several flows between each pair of hosts (h_i, h_j) | i!=j
  6. Monitor the data rate of each flow in the xterm window that's opened, which shows our ability to detect heavy hitters as an example.

Data Plane( Runs on each Host)

Testing & Paper reading——Sketchvisor

 

Control Plane( Runs on Swithch)

 Testing & Paper reading——Sketchvisor

 

CapturePackets.py: captures from selected interface via Pcapy library, parse the packet header and content to extract a  5-Tuple Flow-ID of the form: (SRC_IP, DST_IP, SRC_PORT, DST_PORT, PROTOCOL), then forwards them via ZeroMQ socket to Buffer.py

Buffer.py:listens for incoming packets and fills up the buffer.The NormalPath.py requests new flows to process at a certain rate, and once the buffer is full packets are being forwarded to the FastPath.py process via it's own socket.

NormalPath.py:tracks incoming flows using a Count-Min-Sketch. Updates ControlPlane.py with it's sketch matrix(Let it be M) once every defined interval.

FastPath.py:listens to incoming flows.Implements the Fast-Path concept as shown in the paper, which has a hash table H where it stores top-k flows.The algorithm for updating the hash table is:

Testing & Paper reading——Sketchvisor

ControlPlane.py:listen for incoming flows from both FastPath and NormalPath.It can receive 2 typps of message denoted by the index in the message tuple 它可以接收由消息元组中的索引表示的两种类型的信息.

0-Message contains a FastPath hash table to be merged with the general H hash table.

1-Message contains a sketch matrix M, to be added with matrix addition to the general sketch.

七、Related Work

  • Sampling:widely used in sofeware-defined measurement for low measurement overhead, but inherently misses information and supports only coarse-grained measurement粗粒度测量.
  • Sketches: Many architechtures employ sketches as primitives to chieve fine-grained measurement for various measrurement tasks, but incurs high computational overhead.
  • TCAM:can be used to acheieve high-performance network measurement.
  • Rule matching: selectively processes only packets of interest, thereby reducing measurement overhead,but hash-table incurs much higher memory overhead than sketched-based overhead.
  • recover missing information:a matrix interpolation problem to enable the control plane to recover missing information via compressive sensing

八、Advantages

  • high throughput and high accuracy
  • fine grained 细粒度
  • accurately reason about hte behavior of high traafic load
  • resource efficient
  • recovers network-wide

九、Conclusion

Design and implement SketchVisor, a robust network-wide measurement architecture for software packet processing, with a primary goal of preserving performance and accuracy guarantees even under high traffic load.SketchVisor employs sketches as basic measurement primitives, and achieves high data plane perfomance with a fast path to offload sketch-based measurement under high traffic load.It further leverages compressive sensing to achieve accurate network-wide measurement. Experiments demonstrate that SketchVisor achieves high performance and high accuracy for a rich set of sketch-based solutions.

上一篇:Sketch的下载与安装


下一篇:喊话程序员:Sketch设计作图切图利器,你值得拥有。