haproxy(1)

2023-08-23 13:21:58

Reference documentation:

http://cbonte.github.io/haproxy-dconv/1.5/configuration.html

I. HAProxy

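Software load balancing is generally implemented in one of two ways: inside the operating system, or in a third-party application. LVS is a software load balancer built into the Linux operating system, while HAProxy is an open-source software load balancer implemented as a third-party application. HAProxy is much simpler to use than LVS and is also rich in features. HAProxy currently supports two main proxy modes: "tcp", i.e. layer 4 (mostly used for mail servers, internal protocol servers, and the like), and layer 7 (HTTP). In layer 4 mode, HAProxy only forwards bidirectional traffic between the client and the server. In layer 7 mode, HAProxy analyzes the protocol and can control it by allowing, denying, switching, adding, modifying, or removing specified content in the request or the response, according to configured rules. For details, the configuration manual (configuration.txt) and the architecture guide (architecture.txt) can be downloaded from the HAProxy website (http://haproxy.1wt.eu).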

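Note: HAProxy is event-driven, so for best performance run HAProxy 1.2.5 or later on Linux 2.6, or on Linux 2.4 patched with epoll(). HAProxy 1.1 uses select() as its default polling system, and its performance drops sharply once it handles several thousand file descriptors. Versions 1.2 and 1.3 default to poll(), which still has problems on some operating systems but reportedly performs well on Solaris.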

CentOS 7 is used here, and version 1.5.14 is installed directly from the base repository:

yum -y install haproxy

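Performance characteristics:

(1) A single-process, event-driven model significantly reduces the cost of context switches;

(2) An event checker lets it detect any event on any connection immediately, even with a very large number of concurrent connections;

(3) Wherever possible, single buffering completes reads and writes without copying any data, which saves many CPU cycles and much memory bandwidth;

(4) An MRU (most recently used, the counterpart of LRU) memory allocator provides immediate allocation from fixed-size memory pools, which significantly shortens the time needed to create a session;

(5) Tree-based storage: heavy use of the elastic binary tree that the author developed years ago keeps timers ordered, keeps the run queue ordered, and manages the round-robin and least-connection queues, all at a low O(log(N)) cost;

(6) Optimized HTTP header analysis: headers are parsed without re-reading any memory area;

(7) Expensive system calls are carefully reduced; most of the work is done in user space, such as time reading, buffer aggregation, and enabling and disabling file descriptors.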

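A load balancer's performance is usually assessed along these lines:

  Session rate: how many sessions can be created per second

  Session concurrency: how many sessions can be held open at the same time

  Data rate: how fast data is moved through the sessions

HAProxy does well on all three, and nginx is good too. Each has its own strengths; it is roughly an even split.

The HAProxy website, like nginx's, is very detailed.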

Hello World

Let's configure a simple load balancer with HAProxy. First, here is the official description of the configuration, which is very simple:

HAProxy's configuration process involves 3 major sources of parameters :

  - the arguments from the command-line, which always take precedence

  - the "global" section, which sets process-wide parameters

  - the proxies sections which can take form of "defaults", "listen", "frontend" and "backend".

The configuration file syntax consists in lines beginning with a keyword referenced in this manual, optionally followed by one or several parameters delimited by spaces. If spaces have to be entered in strings, then they must be preceded by a backslash ('\') to be escaped. Backslashes also have to be escaped by doubling them.

Two simple Java web applications were set up with spring-boot, one on port 8081 and one on port 8082.

Append the following to the end of the configuration file:

listen tomcat *:8080

    balance     roundrobin

    server      srv1    192.168.1.103:8081 check

    server      srv2    192.168.1.103:8082 check

listen stats *:9091

    stats    enable

    stats    uri /admin?stats

    stats    realm HAProxy\ Statistics

    stats    auth  admin:admin

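Brief explanation:

  listen declares a proxy; here it is named tomcat and listens on port 8080

  The load-balancing algorithm is roundrobin

  There are two backend servers, one on port 8081 and one on port 8082; check enables automatic health checks on both

  The second section, listen stats, enables the web statistics page, which listens on port 9091 and is served at the URL /admin?stats

  The user name and password for the stats page are both admin

Open http://192.168.1.211:9091/admin?stats to see the result; load balancing is now working.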

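II. Configuration overview

global: process-level parameters, usually related to the operating system (OS). They are normally set once and, if correct, need no further changes.
defaults: default parameters, which are inherited by the frontend, backend, and listen sections.
frontend: the front-end virtual node that accepts requests; a frontend can use rules to select which backend handles a request.
backend: the configuration of a back-end server pool, i.e. the real servers; one backend corresponds to one or more physical servers.
listen: a combination of a frontend and a backend.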
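
As a rough illustration of how the five section types fit together, here is a minimal sketch in HAProxy 1.5 syntax. All names, addresses, ports, and timeout values are assumptions chosen for the example and do not come from a real deployment.

```haproxy
# Minimal sketch; every name, address, port and value below is hypothetical.
global
    daemon
    maxconn 4096

# Settings inherited by every frontend, backend and listen section that follows
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

# Accepts client connections and hands them to a backend
frontend www_in
    bind *:8080
    default_backend app_servers

# The pool of real servers
backend app_servers
    balance roundrobin
    server app1 192.0.2.11:8081 check
    server app2 192.0.2.12:8082 check

# listen combines a frontend and a backend in a single section
listen stats_page
    bind *:9091
    stats enable
    stats uri /haproxy-stats
```

A file like this can be syntax-checked with `haproxy -c -f <file>` before it is loaded.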

Global

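#################### Global settings ########################

####### Process-level parameters, usually related to the operating system (OS) #########

global

maxconn                    # default maximum number of connections per haproxy process

log 127.0.0.1 local3            #[err warning info debug]

chroot /var/haproxy             # directory to chroot into

uid                           # uid of the user to run as

gid                           # gid of the group to run as

daemon                          # run haproxy in the background

nbproc                         # number of processes (several processes can improve performance)

pidfile /var/run/haproxy.pid    # path of the haproxy pid file; the user starting the process must be able to access it

ulimit-n                   # limit on the number of file descriptors (ulimit -n)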

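HAProxy's log messages are collected through syslog. The facility configured here is local3, so local3 must also be configured in syslog.

Other parameters:

maxpipes: haproxy uses pipes for kernel-level TCP splicing; maxpipes is the maximum number of pipes each process may use, maxconn/4 by default

noepoll/nokqueue/nopoll/nosepoll... : disable a given event mechanism

nosplice: disables kernel-level TCP splicing

spread-checks <0...50> : spreads health checks by adding 0% to 50% of randomness to the check interval

Many more parameters can be tuned; see the official HAProxy documentation. The documentation itself, however, advises users not to adjust these parameters by hand.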
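
To make the global keywords more concrete, the sketch below fills in values. Every number, the user and group names, and the log level are assumptions for illustration only; choose values that fit your own system.

```haproxy
# Hypothetical values throughout; none of them come from the original notes.
global
    log 127.0.0.1 local3 info      # send messages up to level info to the local3 facility
    maxconn 20480                  # per-process connection limit
    chroot /var/haproxy            # directory must exist and should be empty
    user haproxy                   # run as this user (alternative to a numeric uid)
    group haproxy                  # run as this group (alternative to a numeric gid)
    daemon                         # fork into the background
    nbproc 1                       # a single process
    pidfile /var/run/haproxy.pid
    spread-checks 5                # add up to 5% randomness to health-check intervals
```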

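III. Proxy keywords matrix

HAProxy has a great many proxy configuration keywords, which the official documentation lists in a matrix. Some of the common ones are described here:

1. bind: specifies the listening sockets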

http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#4-bind

bind [<address>]:<port_range> [, ...] [param*]

bind /<path> [, ...] [param*]

Specifies the sockets to listen on. Some examples:

listen http_proxy

    bind :,:

    bind 10.0.0.1:,10.0.0.1:

    bind /var/run/ssl-frontend.sock user root mode  accept-proxy

listen http_https_proxy

    bind :

    bind : ssl crt /etc/haproxy/site.pem

listen http_https_proxy_explicit

    bind ipv6@:

    bind ipv4@public_ssl: ssl crt /etc/haproxy/site.pem

    bind unix@ssl-frontend.sock user root mode  accept-proxy

listen external_bind_app1

    bind fd@${FD_APP1}
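
For a complete picture of what a bind line looks like with its values in place, here is a small sketch. The port numbers, the socket mode, and the backend are assumptions made for this example. The `ssl` keyword only works when HAProxy was built with OpenSSL support, and the fragment assumes a defaults section that sets the timeouts.

```haproxy
# Hypothetical ports, socket mode and backend; adjust to your environment.
frontend web_in
    mode http
    # plain HTTP on all IPv4 addresses
    bind :80
    # HTTPS; site.pem must contain both the certificate and its private key
    bind :443 ssl crt /etc/haproxy/site.pem
    # UNIX socket accessible to root only, expecting the PROXY protocol header
    bind /var/run/ssl-frontend.sock user root mode 600 accept-proxy
    default_backend app_servers

backend app_servers
    mode http
    server app1 192.0.2.11:8081 check
```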

2. balance: scheduling algorithm

http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#4-balance

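roundrobin: dynamic algorithm; weighted round robin that supports slow start and run-time weight changes, limited to 4095 active servers per backend

static-rr: static algorithm; static weighted round robin with no limit on the number of backend servers

leastconn: dynamic least-connections. Short connections (say, shorter than 5 s) are better served by roundrobin, since counting connections brings no benefit there. Recommended for long sessions, with protocols such as LDAP and MySQL;

first: The first server with available connection slots receives the connection. Servers are chosen top-down by their position in the list; once a server reaches its connection limit, scheduling moves on to the next one

source: hashes the source IP address

The hash here can be either map-based or consistent: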

http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#hash-type

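The hash function can also be specified.

uri: hashes the left part of the URI or the whole URI, so that requests for the same URI always go to the same server; particularly suitable for cache servers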

This algorithm hashes either the left part of the URI (before the question mark) or the whole URI (if the "whole" parameter is present) and divides the hash value by the total weight of the running servers.

url_param: similar to the above, but only a parameter in the URI's query string is hashed; suitable for sites with logins

The URL parameter specified in argument will be looked up in the query string of each HTTP GET request.

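hdr(<name>): hashes the HTTP request header named <name>; if the header is absent, round robin is used instead.

  hdr(Host): schedules on the request's Host header

  hdr(Cookie)

rdp-cookie: for the Remote Desktop Protocol; not covered here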
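
The algorithms above are selected per backend. The following sketch shows three of them side by side; the backend names, server addresses, and the `userid` parameter name are assumptions made for this example.

```haproxy
# Hypothetical backends illustrating different balance algorithms.

# Cache pool: the same URI always lands on the same cache; a consistent
# hash limits remapping when a server is added or removed.
backend cache_servers
    mode http
    balance uri
    hash-type consistent
    server cache1 192.0.2.31:3128 check
    server cache2 192.0.2.32:3128 check

# Requests carrying the same userid query parameter reach the same server.
backend login_servers
    mode http
    balance url_param userid
    server login1 192.0.2.41:8080 check
    server login2 192.0.2.42:8080 check

# Long-lived TCP sessions: pick the server with the fewest connections.
backend ldap_servers
    mode tcp
    balance leastconn
    server ldap1 192.0.2.51:389 check
    server ldap2 192.0.2.52:389 check
```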

3. server: backend host configuration

http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#4-server

server <name> <address>[:[port]] [param*]

server first  10.1.1.1: cookie first  check inter

server second 10.1.1.2: cookie second check inter

server transp ipv4@

server backup ${SRV_BACKUP}: backup

server www1_dc1 ${LAN_DC1}.:

server www1_dc2 ${LAN_DC2}.:

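A great many parameters are supported: http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#5.2

backup: marks this server as a backup server;
check: enables health checks;
-   inter <delay>: interval between two consecutive checks, in ms; default 2000;
-   fall <count>: number of consecutive failed checks after which the server is marked dead
-   rise <count>: number of consecutive successful checks after which the server is marked available
-   addr: address to use for health checks
cookie <value>: sets the cookie value for this server, used for cookie-based session stickiness.
maxconn: maximum number of concurrent connections this server accepts; excess connections are placed in the request queue
maxqueue: maximum length of the request queue;
observe: observes traffic to the backend server and judges its health from that traffic;
port: port to use for health checks
weight: the server's weight; default 1, maximum 256; 0 means the server is not scheduled;
redir <prefix>: redirection; GET and HEAD requests destined for this server are answered with a 302 redirect;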
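
Put together, several of these parameters typically appear on one server line. This is only a sketch: the addresses, ports, and all numeric values are assumptions chosen for illustration.

```haproxy
# Hypothetical addresses and values.
backend app_servers
    mode http
    balance roundrobin
    # check every 2000 ms; 3 failed checks mark the server dead, 2 successful ones
    # mark it available; at most 200 concurrent connections, up to 50 requests queued
    server app1 192.0.2.11:8081 check inter 2000 fall 3 rise 2 weight 10 maxconn 200 maxqueue 50
    # half the weight of app1, so it receives about half as many requests
    server app2 192.0.2.12:8081 check inter 2000 fall 3 rise 2 weight 5
    # receives traffic only when all non-backup servers are down
    server spare 192.0.2.13:8081 check backup
```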

About check:

In http mode, check can also be combined with option httpchk, which is very simple:

http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#4-option%20httpchk

option httpchk
option httpchk <uri>
option httpchk <method> <uri>
option httpchk <method> <uri> <version>

For example: option httpchk GET /index.html # file fetched by the health check
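
In context, an HTTP health check might be wired up as in the sketch below. The backend name, addresses, and check path are assumptions; `http-check expect` is optional and is shown only to make the accepted status explicit.

```haproxy
# Hypothetical backend using an HTTP health check.
backend app_servers
    mode http
    # request /index.html on each server instead of only opening a TCP connection
    option httpchk GET /index.html
    # treat only a 200 response as healthy
    http-check expect status 200
    server app1 192.0.2.11:8081 check inter 2000
    server app2 192.0.2.12:8081 check inter 2000
```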

About cookie:

Enable cookie-based persistence in a backend.
<name>: is the name of the cookie which will be monitored, modified or inserted in order to bring persistence.
This cookie is sent to the client in a Set-Cookie response header, and every later request from that client carries it. Take care that its name does not clash with other cookies used by the application.

Ways of sending the cookie:
rewrite: This keyword indicates that the cookie will be provided by the server and that haproxy will have to modify its value to set the server's identifier in it.
This mode is handy when the management of complex combinations of "Set-Cookie" and "Cache-control" headers is left to the application. The application can then decide whether or not it is appropriate to emit a persistence cookie. Since all responses should be monitored, this mode doesn't work in HTTP tunnel mode. Unless the application behaviour is very complex and/or broken, it is advised not to start with this mode for new deployments. This keyword is incompatible with "insert" and "prefix".

insert: This keyword indicates that the persistence cookie will have to be inserted by haproxy in server response if the client did not already have a cookie that would have permitted it to access this server. When used without the "preserve" option, if the server emits a cookie with the same name, it will be removed before processing. For this reason, this mode can be used to upgrade existing configurations running in the "rewrite" mode. The cookie will only be a session cookie and will not be stored on the client's disk. By default, unless the "indirect" option is added, the server will see the cookie emitted by the client. Due to caching effects, it is generally wise to add "nocache" or "postonly" keywords.

prefix

indirect: when this option is specified, no cookie will be emitted to a client which already has a valid one for the server which processed the request. If the server sets such a cookie itself, it will be removed, unless the "preserve" option is also set.
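
A common way to combine these keywords is insert mode with indirect and nocache. The sketch below assumes a cookie named SERVERID and two hypothetical servers; it illustrates the keywords above and is not a recommendation for any particular application.

```haproxy
# Hypothetical backend with cookie-based persistence.
backend app_servers
    mode http
    balance roundrobin
    # haproxy inserts the SERVERID cookie itself, hides it from the servers
    # (indirect), and marks the responses that set it as non-cacheable (nocache)
    cookie SERVERID insert indirect nocache
    # the value after "cookie" is what gets stored in SERVERID for each server
    server app1 192.0.2.11:8081 cookie app1 check
    server app2 192.0.2.12:8081 cookie app2 check
```

A client whose first response came from app1 receives `SERVERID=app1` and keeps being sent to app1 for as long as that server is up.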

4. mode

{tcp | http | health}

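tcp: the instance works in TCP mode and performs no layer 7 protocol analysis; used for protocols such as SSL, SSH, and SMTP
http: the most commonly used mode, HTTP
health: health mode; any incoming connection just gets "OK" in reply and is closed.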
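
For comparison with the HTTP examples elsewhere in these notes, a layer 4 proxy could look like the sketch below. The section name, port, and server addresses are assumptions.

```haproxy
# Hypothetical layer 4 proxy in front of two database servers.
listen mysql_pool
    bind :3306
    mode tcp                 # forward bytes only; no HTTP parsing
    option tcplog            # log connections in the TCP log format
    balance leastconn        # suits long-lived database sessions
    server db1 192.0.2.21:3306 check
    server db2 192.0.2.22:3306 check
```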

5. option forwardfor

option forwardfor [ except <network> ] [ header <name> ] [ if-none ]

Enable insertion of the X-Forwarded-For header to requests sent to servers, much like passing the client IP to the upstream in Nginx.

# Public HTTP address also used by stunnel on the same machine

frontend www

    mode http

    option forwardfor except 127.0.0.1  # stunnel already adds the header

# Those servers want the IP Address in X-Client

backend www

    mode http

    option forwardfor header X-Client

6. ACL

ACLs are an extensive topic: http://cbonte.github.io/haproxy-dconv/1.5/configuration.html#7

ACL names (naming rules): ACL names must be formed from upper and lower case letters, digits, '-' (dash),'_' (underscore) , '.' (dot) and ':' (colon). ACL names are case-sensitive, which means that "my_acl" and "My_Acl" are two different ACLs.

In short: only upper and lower case letters, digits, '-', '_', '.', and ':' are allowed, and names are case-sensitive.

The basic format is: acl <aclname> <criterion> [flags] [operator] <value> ...

criterion:

The criterion generally is the name of a sample fetch method, or one of its ACL specific declinations. The default test method is implied by the output type of this sample fetch method. The ACL declinations can describe alternate matching methods of a same sample fetch method. The sample fetch methods are the only ones supporting a conversion.

Sample fetch methods return data which can be of the following types :

boolean
integer (signed or unsigned)
IPv4 or IPv6 address
string
data block

flags: special ACL flags, with the following meanings:

-i : ignore case during matching of all subsequent patterns.
-f : load patterns from a file.
-m : use a specific pattern matching method
-n : forbid the DNS resolutions
-M : load the file pointed by -f like a map file.
-u : force the unique id of the ACL
-- : force end of flags. Useful when a string looks like one of the flags.

The most commonly used flag is -i, which ignores case.

The criterion has a great many forms, grouped into Layer 4, Layer 5, Layer 6, and Layer 7.

For example, req.cook and req.hdr at Layer 7.

Another common one is path: This extracts the request's URL path, which starts at the first slash and ends before the question mark (without the host part). path itself has several variants:

path : exact string match
path_beg : prefix match
path_dir : subdir match
path_dom : domain match
path_end : suffix match
path_len : length match
path_reg : regex match
path_sub : substring match

Likewise url: This extracts the request's URL as presented in the request. It has similar variants:

url : exact string match
url_beg : prefix match
url_dir : subdir match
url_dom : domain match
url_end : suffix match
url_len : length match
url_reg : regex match
url_sub : substring match

There are also url_param and others.
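
The path variants listed above can be combined to route requests. In this sketch the ACL names, paths, and backend names are all assumptions, and the three backends are presumed to be defined elsewhere in the same file.

```haproxy
# Hypothetical frontend routing on the request path.
frontend www_in
    mode http
    bind :80
    # prefix match, case-insensitive: /static/... or /images/...
    acl is_static path_beg -i /static /images
    # suffix match, case-insensitive
    acl is_image  path_end -i .jpg .png .gif
    # regex match: /api/v1/, /api/v2/, ...
    acl is_api    path_reg ^/api/v[0-9]+/
    use_backend static_servers if is_static or is_image
    use_backend api_servers    if is_api
    default_backend app_servers
```

With these rules, `/images/logo.PNG` goes to static_servers, `/api/v2/users` goes to api_servers, and everything else falls through to app_servers.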

7. Access control

Access control:
block { if | unless } <condition>

http-request: manipulates the HTTP request
acl nagios src 192.168.129.3
acl local_net src 192.168.0.0/16
acl auth_ok http_auth(L1)

http-request allow if nagios
http-request allow if local_net auth_ok
http-request auth realm Gimme if local_net auth_ok
http-request deny

http-response: manipulates the HTTP response
acl key_acl res.hdr(X-Acl-Key) -m found
acl myhost hdr(Host) -f myhost.lst

http-response add-acl(myhost.lst) %[res.hdr(X-Acl-Key)] if key_acl
http-response del-acl(myhost.lst) %[res.hdr(X-Acl-Key)] if key_acl

IV. Demo

http://rickyhui.blog.51cto.com/10570875/1680676

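       #################### Global settings ########################

       ####### Process-level parameters, usually related to the operating system (OS) #########

global

       maxconn                    # default maximum number of connections

       log 127.0.0.1 local3            #[err warning info debug]

       chroot /var/haproxy             # directory to chroot into

       uid                           # uid of the user to run as

       gid                           # gid of the group to run as

       daemon                          # run haproxy in the background

       nbproc                         # number of processes (several processes can improve performance)

       pidfile /var/run/haproxy.pid    # path of the haproxy pid file; the user starting the process must be able to access it

       ulimit-n                   # limit on the number of file descriptors (ulimit -n)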

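       ##################### Default settings ######################

       ## These parameters are inherited by the frontend, backend, and listen sections ##

defaults

       log global

       mode http                       # processing mode (layer 7: http; layer 4: tcp)

       maxconn                    # maximum number of connections

       option httplog                  # log in the HTTP log format

       option httpclose                # actively close the HTTP connection after each request, i.e. no client keep-alive

       option dontlognull              # do not log null connections (no data exchanged), such as health-check probes

       option forwardfor               # needed if backend servers must see the real client IP; they can read it from an HTTP header

       option redispatch               # if the server matching serverId is down, redirect the session to another healthy server

       option abortonclose             # under high load, drop requests still waiting in the queue whose client has already closed the connection

       stats refresh                 # refresh interval of the statistics page

       retries                        # after 3 failed connection attempts the server is considered unavailable; can also be set later

       balance roundrobin              # default load-balancing method: round robin

      #balance source                  # default load-balancing method: source hash, similar to nginx's ip_hash

      #balance leastconn               # default load-balancing method: least connections

       contimeout                  # connect timeout

       clitimeout                 # client timeout

       srvtimeout                 # server timeout

       timeout check               # health-check timeout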

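       #################### Statistics page settings #######################

listen admin_status                    # combination of frontend and backend; name of the monitoring group, choose as needed

        bind 0.0.0.0:             # listening port

        mode http                      # layer 7 http mode

        log 127.0.0.1 local3 err       # error logging

        stats refresh 5s               # refresh the statistics page every 5 seconds

        stats uri /admin?stats         # URL of the statistics page

        stats realm itnihao\ itnihao   # prompt text shown by the statistics page

        stats auth admin:admin         # user admin with password admin; several users can be defined

        stats auth admin1:admin1       # user admin1 with password admin1

        stats hide-version             # hide the HAProxy version on the statistics page

        stats admin if TRUE            # allow manually enabling/disabling backend servers (haproxy 1.4.9 and later)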

       errorfile  /etc/haproxy/errorfiles/.http

       errorfile  /etc/haproxy/errorfiles/.http

       errorfile  /etc/haproxy/errorfiles/.http

       errorfile  /etc/haproxy/errorfiles/.http

       errorfile  /etc/haproxy/errorfiles/.http 

       ################# What HAProxy records in its logs ###################

       capture request  header Host           len

       capture request  header Content-Length len

       capture request  header Referer        len

       capture response header Server         len

       capture response header Content-Length len

       capture response header Cache-Control  len  

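       ####################### Site monitoring listen section #####################

       ########### Mainly used to monitor the state of the backend servers behind haproxy ############

listen site_status

       bind 0.0.0.0:                    # listening port

       mode http                            # layer 7 http mode

       log 127.0.0.1 local3 err             #[err warning info debug]

       monitor-uri /site_status             # site health-check URL, used to test whether the sites managed by HAProxy are usable; returns 200 when healthy, 503 otherwise

       acl site_dead nbsrv(server_web) lt  # policy for a site being down: true when the number of usable servers in the given backend drops below 1

       acl site_dead nbsrv(server_blog) lt

       acl site_dead nbsrv(server_bbs)  lt

       monitor fail if site_dead            # return 503 when the policy matches; documents online say 500, but testing shows 503

       monitor-net 192.168.16.2/          # requests from 192.168.16.2 are neither logged nor forwarded

       monitor-net 192.168.16.3/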

       ######## frontend configuration ############

       ##### Note: several ACLs can be defined and matched inside a frontend ########

frontend http_80_in

       bind 0.0.0.0:      # listening port, i.e. the port on which haproxy serves web traffic, similar to the VIP port in LVS

       mode http            # layer 7 http mode

       log global           # use the global log settings

       option httplog       # enable HTTP logging

       option httpclose     # actively close the HTTP connection after each request, so keep-alive is not used

       option forwardfor    # needed if backend servers must see the real client IP; they can then read it from an HTTP header

       ######## ACL definitions #############

       acl itnihao_web hdr_reg(host) -i ^(www.itnihao.cn|ww1.itnihao.cn)$

       # true if the requested domain matches either of the two domains in the regular expression; -i ignores case

       acl itnihao_blog hdr_dom(host) -i blog.itnihao.cn

       # true if the requested domain matches blog.itnihao.cn; -i ignores case

       #acl itnihao    hdr(host) -i itnihao.cn

       # true if the requested domain is itnihao.cn; -i ignores case

       #acl file_req url_sub -i  killall=

       # true if the request URL contains killall=, false otherwise

       #acl dir_req url_dir -i allow

       # true if allow appears as a path component of the request URL, false otherwise

       #acl missing_cl hdr_cnt(Content-length) eq 0

       # true if the request has no Content-length header (the header count equals 0)

       ######## Actions on ACL matches #############

       #block if missing_cl

       # block the request with a 403 if it has no Content-length header

       #block if !file_req || dir_req

       # block rejects the request with a 403 error; here, the request is blocked if file_req does not match or dir_req matches

       use_backend  server_web  if itnihao_web

       # use the server_web backend when itnihao_web matches

       use_backend  server_blog if itnihao_blog

       # use the server_blog backend when itnihao_blog matches

       #redirect prefix http://blog.itniaho.cn code 301 if itnihao

       # when itnihao.cn is requested, redirect with HTTP 301 to the prefix given above

       default_backend server_bbs

       # use the default backend server_bbs when none of the above match

       ########## backend settings ##############

       # Three server groups are set up below: server_web, server_blog, server_bbs

###########################backend server_web#############################

backend server_web

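       mode http            # layer 7 http mode

       balance roundrobin   # load-balancing method: roundrobin distributes requests evenly

       cookie SERVERID      # allow the serverid to be inserted into the cookie; the serverid is defined on the server lines below

       option httpchk GET /index.html # file fetched by the health check

       server web1 192.168.16.2: cookie web1 check inter  rise  fall  weight

       # server definition: cookie web1 sets the serverid to web1; check inter 1500 is the health-check interval; rise 3 means 3 successful checks mark the server available;

       # fall 3 means 3 failed checks mark the server unavailable; weight is the server's weight

       server web2 192.168.16.3: cookie web2 check inter  rise  fall  weight

       # server definition: cookie web2 sets the serverid to web2; check inter 1500 is the health-check interval; rise 3 means 3 successful checks mark the server available;

       # fall 3 means 3 failed checks mark the server unavailable; weight is the server's weight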

###################################backend server_blog###############################################

backend server_blog

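       mode http            # layer 7 http mode

       balance roundrobin   # load-balancing method: roundrobin distributes requests evenly

       cookie SERVERID      # allow the serverid to be inserted into the cookie; the serverid is defined on the server lines below

       option httpchk GET /index.html # file fetched by the health check

       server blog1 192.168.16.2: cookie blog1 check inter  rise  fall  weight

       # server definition: cookie blog1 sets the serverid to blog1; check inter 1500 is the health-check interval; rise 3 means 3 successful checks mark the server available; fall 3 means 3 failed checks mark it unavailable; weight is the server's weight

       server blog2 192.168.16.3: cookie blog2 check inter  rise  fall  weight

       # server definition: cookie blog2 sets the serverid to blog2; check inter 1500 is the health-check interval; rise 3 means 3 successful checks mark the server available; fall 3 means 3 failed checks mark it unavailable; weight is the server's weight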

###################################backend server_bbs############################################### 

backend server_bbs

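       mode http            # layer 7 http mode

       balance roundrobin   # load-balancing method: roundrobin distributes requests evenly

       cookie SERVERID      # allow the serverid to be inserted into the cookie; the serverid is defined on the server lines below

       option httpchk GET /index.html # file fetched by the health check

       server bbs1 192.168.16.2: cookie bbs1 check inter  rise  fall  weight

       # server definition: cookie bbs1 sets the serverid to bbs1; check inter 1500 is the health-check interval; rise 3 means 3 successful checks mark the server available; fall 3 means 3 failed checks mark it unavailable; weight is the server's weight

       server bbs2 192.168.16.3: cookie bbs2 check inter  rise  fall  weight

       # server definition: cookie bbs2 sets the serverid to bbs2; check inter 1500 is the health-check interval; rise 3 means 3 successful checks mark the server available; fall 3 means 3 failed checks mark it unavailable; weight is the server's weight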
