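Nginx built-in variables, fine-grained rules, getting the real client IP, and limiting connections and requests

I hope to put this to use after testing next week! It looks very useful.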

http://www.bzfshop.net/article/176.html

http://www.cr173.com/html/19761_1.html

http://blog.pixelastic.com/2013/09/27/understanding-nginx-location-blocks-rewrite-rules/

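Configuration you won't find on Google

A website is often not the simple structure  user's browser  -->  your server. To improve access speed, there may be several layers of network acceleration (CDN) in between. Take this site, www.bzfshop.net, as an example. For security and speed, our architecture is:

User's browser  -->  360 Website Guard acceleration (CDN, 360 protection against CC and DoS attacks)  -->  Alibaba Cloud acceleration server (a CDN we built ourselves, with Alibaba Cloud Shield)  -->  origin server (where the PHP application is deployed; iptables and nginx security configuration)

As you can see, requests pass through several layers of transparent acceleration and security filtering, so the "basic configuration" above no longer works. Limiting by source IP would end up throttling 360 Website Guard or Alibaba Cloud Shield, because the "source IP" nginx sees is no longer the user's IP but the IP of an intermediate acceleration server. What we need to limit is the end user at the front of the chain, not the servers that accelerate traffic for us.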

 

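2.1 The most immediate problem: after so many layers of acceleration, how do I get the IP address of the end user at the front of the chain?

(Only the result is given here. If you are not familiar with the HTTP protocol, see Google or Wikipedia: http://zh.wikipedia.org/zh-cn/X-Forwarded-For )

When a CDN or transparent proxy forwards a user's request to the server behind it, it adds a record to the HTTP headers:

X-Forwarded-For: client IP, proxy IP

If the request passes through more than one proxy, as it does for www.bzfshop.net with its multiple proxy layers, the record looks like this:

X-Forwarded-For: client IP, proxy 1 IP, proxy 2 IP, proxy 3 IP, ….

After several layers of proxies, the user's real IP is in the first position, followed by the IP addresses of the intermediate proxies. Take the user's real IP from there and apply the limits to that address.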

 

2.2 Getting the original user's IP behind multiple CDN layers: nginx configuration
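
A minimal sketch of such a configuration, assuming the standard map module is available. It sets $clientRealIp (the variable name used in the following sections) to the first address in X-Forwarded-For and falls back to $remote_addr when the header is missing or unparsable. The capture name firstAddr and the character class are illustrative choices, not taken from the site's actual configuration.

```nginx
# Goes inside the http {} block.
map $http_x_forwarded_for $clientRealIp {
    # No header, or a value that does not start with an address:
    # the peer address is the best information available.
    default                          $remote_addr;

    # Otherwise take the first (left-most) address in the list.
    ~^(?P<firstAddr>[0-9a-fA-F:.]+)  $firstAddr;
}
```

Keep in mind that X-Forwarded-For is a request header. Unless the outermost CDN overwrites it, a client can send its own value, so the first entry is only as trustworthy as the proxies in front of the server.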

 

 

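2.3 Test, test

Often you find a pile of configuration online and copy it, but how do you know it is actually correct? You need to run a real test yourself and adopt the configuration only after verifying it.

How do you test this kind of Nginx configuration? With the Echo module, if you know it.

Taking www.bzfshop.net as an example, we first test whether $clientRealIp really is the IP address of our client machine. Add a test URL to the site, for example www.bzfshop.net/nginx-test, and have that location print the value of $clientRealIp with the Echo module.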
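
A test location along these lines could look like the sketch below. It assumes nginx was built with the third-party echo module and that $clientRealIp is already defined in the http block; the listen port is an assumption.

```nginx
server {
    listen       80;
    server_name  www.bzfshop.net;

    # Print the computed client IP so it can be compared with
    # what an external "what is my IP" service reports.
    location /nginx-test {
        echo $clientRealIp;
    }
}
```

Without the echo module, a similar check should be possible with the built-in return directive, for example return 200 $clientRealIp; in the same location.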

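Next, visit www.bzfshop.net/nginx-test in your browser. The browser will prompt you to download a file named nginx-test. Open it in Notepad++ once the download finishes; it contains a single IP address.

Visit www.ip138.com and check whether the IP address recorded in the file matches the one ip138 detects.

This way you can effectively test even fairly complex Nginx configuration.

After testing, we confirmed that $clientRealIp is still the valid original user IP after passing through multiple CDN layers.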

 

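2.4 Limiting connections by the user's real IP

The Nginx configuration is then changed so that the limits are keyed on $clientRealIp instead of the source IP.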
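
As a rough sketch of what that change can look like: the zone names, zone sizes, and numeric limits below are placeholders chosen for illustration, not the site's real values, and the map that defines $clientRealIp is assumed to be present in the same http block.

```nginx
http {
    # The map defining $clientRealIp (section 2.2) is assumed to be here.

    # Concurrent connections, keyed on the real client IP.
    limit_conn_zone $clientRealIp zone=perip_conn:20m;
    limit_conn      perip_conn 50;

    # Request rate, keyed on the real client IP.
    limit_req_zone  $clientRealIp zone=perip_req:20m rate=10r/s;

    server {
        location / {
            # Allow short bursts above the configured rate.
            limit_req zone=perip_req burst=20 nodelay;
        }
    }
}
```

Because $clientRealIp is a text value rather than the packed $binary_remote_addr, each entry takes more space in the shared zone, so the zones may need to be larger than with the source-IP version.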

 

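Postscript:

With the configuration above, your site can work with any network acceleration service (CDN) while still enforcing limits on end users.

I wrote this article because we recently moved www.bzfshop.net onto 360 Website Guard (wangzhan.360.cn), using it as our acceleration and security layer. Our own nginx was also configured with anti-attack measures, and the result was that our security configuration blocked 360 Website Guard's acceleration servers: since all user traffic arrives through them, they obviously exceeded our connection limits. After the changes above, our Nginx security configuration works well together with the 360 acceleration servers while still limiting end-user access.

I hope these notes are useful to anyone who comes across this article.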

~~~~~~~~~~~~~~~~~~~~

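I often configure Nginx, which has many variables starting with $, and I often need to look up which variables nginx supports.

Perhaps because I am not familiar with the Nginx documentation, I simply read the source code and worked out the supported variables from it.

The HTTP variables Nginx supports are defined in the ngx_http_core_variables array in ngx_http_variables.c: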

static ngx_http_variable_t ngx_http_core_variables[] = {

    { ngx_string("http_host"), NULL, ngx_http_variable_header,
      offsetof(ngx_http_request_t, headers_in.host), 0, 0 },

    { ngx_string("http_user_agent"), NULL, ngx_http_variable_header,
      offsetof(ngx_http_request_t, headers_in.user_agent), 0, 0 },

    { ngx_string("http_referer"), NULL, ngx_http_variable_header,
      offsetof(ngx_http_request_t, headers_in.referer), 0, 0 },

#if (NGX_HTTP_GZIP)
    { ngx_string("http_via"), NULL, ngx_http_variable_header,
      offsetof(ngx_http_request_t, headers_in.via), 0, 0 },
#endif

#if (NGX_HTTP_PROXY || NGX_HTTP_REALIP)
    { ngx_string("http_x_forwarded_for"), NULL, ngx_http_variable_header,
      offsetof(ngx_http_request_t, headers_in.x_forwarded_for), 0, 0 },
#endif

    { ngx_string("http_cookie"), NULL, ngx_http_variable_headers,
      offsetof(ngx_http_request_t, headers_in.cookies), 0, 0 },

    { ngx_string("content_length"), NULL, ngx_http_variable_header,
      offsetof(ngx_http_request_t, headers_in.content_length), 0, 0 },

    { ngx_string("content_type"), NULL, ngx_http_variable_header,
      offsetof(ngx_http_request_t, headers_in.content_type), 0, 0 },

    { ngx_string("host"), NULL, ngx_http_variable_host, 0, 0, 0 },

    { ngx_string("binary_remote_addr"), NULL,
      ngx_http_variable_binary_remote_addr, 0, 0, 0 },

    { ngx_string("remote_addr"), NULL, ngx_http_variable_remote_addr, 0, 0, 0 },

    { ngx_string("remote_port"), NULL, ngx_http_variable_remote_port, 0, 0, 0 },

    { ngx_string("server_addr"), NULL, ngx_http_variable_server_addr, 0, 0, 0 },

    { ngx_string("server_port"), NULL, ngx_http_variable_server_port, 0, 0, 0 },

    { ngx_string("server_protocol"), NULL, ngx_http_variable_request,
      offsetof(ngx_http_request_t, http_protocol), 0, 0 },

    { ngx_string("scheme"), NULL, ngx_http_variable_scheme, 0, 0, 0 },

    { ngx_string("request_uri"), NULL, ngx_http_variable_request,
      offsetof(ngx_http_request_t, unparsed_uri), 0, 0 },

    { ngx_string("uri"), NULL, ngx_http_variable_request,
      offsetof(ngx_http_request_t, uri),
      NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("document_uri"), NULL, ngx_http_variable_request,
      offsetof(ngx_http_request_t, uri),
      NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("request"), NULL, ngx_http_variable_request_line, 0, 0, 0 },

    { ngx_string("document_root"), NULL,
      ngx_http_variable_document_root, 0, NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("realpath_root"), NULL,
      ngx_http_variable_realpath_root, 0, NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("query_string"), NULL, ngx_http_variable_request,
      offsetof(ngx_http_request_t, args),
      NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("args"),
      ngx_http_variable_request_set,
      ngx_http_variable_request,
      offsetof(ngx_http_request_t, args),
      NGX_HTTP_VAR_CHANGEABLE|NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("is_args"), NULL, ngx_http_variable_is_args,
      0, NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("request_filename"), NULL,
      ngx_http_variable_request_filename, 0,
      NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("server_name"), NULL, ngx_http_variable_server_name, 0, 0, 0 },

    { ngx_string("request_method"), NULL,
      ngx_http_variable_request_method, 0,
      NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("remote_user"), NULL, ngx_http_variable_remote_user, 0, 0, 0 },

    { ngx_string("body_bytes_sent"), NULL, ngx_http_variable_body_bytes_sent,
      0, 0, 0 },

    { ngx_string("request_completion"), NULL,
      ngx_http_variable_request_completion,
      0, 0, 0 },

    { ngx_string("request_body"), NULL,
      ngx_http_variable_request_body,
      0, 0, 0 },

    { ngx_string("request_body_file"), NULL,
      ngx_http_variable_request_body_file,
      0, 0, 0 },

    { ngx_string("sent_http_content_type"), NULL,
      ngx_http_variable_sent_content_type, 0, 0, 0 },

    { ngx_string("sent_http_content_length"), NULL,
      ngx_http_variable_sent_content_length, 0, 0, 0 },

    { ngx_string("sent_http_location"), NULL,
      ngx_http_variable_sent_location, 0, 0, 0 },

    { ngx_string("sent_http_last_modified"), NULL,
      ngx_http_variable_sent_last_modified, 0, 0, 0 },

    { ngx_string("sent_http_connection"), NULL,
      ngx_http_variable_sent_connection, 0, 0, 0 },

    { ngx_string("sent_http_keep_alive"), NULL,
      ngx_http_variable_sent_keep_alive, 0, 0, 0 },

    { ngx_string("sent_http_transfer_encoding"), NULL,
      ngx_http_variable_sent_transfer_encoding, 0, 0, 0 },

    { ngx_string("sent_http_cache_control"), NULL, ngx_http_variable_headers,
      offsetof(ngx_http_request_t, headers_out.cache_control), 0, 0 },

    { ngx_string("limit_rate"), ngx_http_variable_request_set_size,
      ngx_http_variable_request_get_size,
      offsetof(ngx_http_request_t, limit_rate),
      NGX_HTTP_VAR_CHANGEABLE|NGX_HTTP_VAR_NOCACHEABLE, 0 },

    { ngx_string("nginx_version"), NULL, ngx_http_variable_nginx_version,
      0, 0, 0 },

    { ngx_string("hostname"), NULL, ngx_http_variable_hostname,
      0, 0, 0 },

    { ngx_string("pid"), NULL, ngx_http_variable_pid,
      0, 0, 0 },

    { ngx_null_string, NULL, NULL, 0, 0, 0 }
};

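Extracting the names from this table gives the following variables (each is written with a leading $ in the configuration):

Request headers: $http_host, $http_user_agent, $http_referer, $http_via (only when built with gzip support), $http_x_forwarded_for (only when built with the proxy or realip module), $http_cookie, $content_length, $content_type

Client and server: $host, $binary_remote_addr, $remote_addr, $remote_port, $server_addr, $server_port, $server_protocol, $server_name, $scheme, $hostname, $nginx_version, $pid

Request: $request, $request_uri, $uri, $document_uri, $document_root, $realpath_root, $query_string, $args, $is_args, $request_filename, $request_method, $remote_user, $request_completion, $request_body, $request_body_file, $limit_rate

Response: $body_bytes_sent, $sent_http_content_type, $sent_http_content_length, $sent_http_location, $sent_http_last_modified, $sent_http_connection, $sent_http_keep_alive, $sent_http_transfer_encoding, $sent_http_cache_control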
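
To show how these names are used in practice, here is a small sketch of an access log format built only from variables that appear in the table above. The format name realip and the log path are placeholders.

```nginx
http {
    # Log both the peer address and the forwarded chain for each request.
    log_format realip '$remote_addr [$http_x_forwarded_for] "$request" '
                      '$body_bytes_sent "$http_referer" "$http_user_agent"';

    access_log logs/access.log realip;
}
```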

~~~~~~~~~~~~~~~~~~~~~~~~~~~

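Practical application

When running as a proxy server, we need to limit each user's request rate and number of connections. But a page has many sub-resources, and limiting everything indiscriminately causes unnecessary trouble. For example, if a page has 40 sub-resources, the request rate and connection limits both have to be raised to 40 for the page to load completely without blocking normal requests. A limit that high has a big impact on server performance: a few hundred users are enough to drag down a single nginx instance.

So we need to decide which requests should be limited, such as HTML pages, and which should not, such as CSS, JS, and images. This is done by refining the configuration with separate location blocks.

We do not limit connections for css, js, gif, png, jpg, and similar files, and limit everything else: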

http {

    # Shared zones keyed on the client address.
    limit_conn_zone $binary_remote_addr zone=addr:10m;
    limit_req_zone $binary_remote_addr zone=one:10m rate=5r/s;

    ...

    server {

        ...

        # Static assets: no connection or request limits.
        # (proxy_pass to the backend goes in each location.)
        location ~ \.(gif|png|css|js|icon)$ {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }

        # JPEG images, matched case-insensitively: also unlimited.
        location ~* \.(jpeg|jpg)$ {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        #    image_filter resize 480 -;
        #    image_filter_jpeg_quality 50;
        #    image_filter_sharpen 10;
        #    image_filter_buffer 4M;
        }

        # Everything else (pages and dynamic requests) is limited.
        location / {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            #limit
            limit_conn addr 3;
            limit_req zone=one burst=5;
        }
    }
}

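A brief introduction to location configuration:

Syntax: location [=|~|~*|^~] /uri/ { … }

= means an exact match.

^~ means the URI starts with the given literal string (a prefix match); if this is the longest matching prefix, nginx skips the regex locations. nginx matches locations against the decoded URI, so a request for /static/%20/aa can be matched by the rule ^~ "/static/ /aa" (note the space).

~ means a case-sensitive regex match.

~* means a case-insensitive regex match.

!~ and !~* are the negated forms (case-sensitive and case-insensitive "does not match"). They are operators for if conditions, not location modifiers.

/ is the catch-all: every request matches it.

With multiple location blocks, the matching order is:

First = is checked. Next nginx finds the longest matching prefix location; if it is marked ^~, that location is used. Otherwise the regex locations are tried in the order they appear in the file, and the first one that matches is used. If no regex matches, the longest matching prefix location is used, which in the end is the catch-all /. Once a location has been chosen, matching stops and the request is handled by that location.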
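
The order can be tried out with a throwaway server like the sketch below. The port and the response bodies are arbitrary; each location simply returns a label so it is easy to see which one handled a request.

```nginx
server {
    listen 8080;

    # Only the request "/" itself.
    location = /            { return 200 "exact"; }

    # Longest prefix for /static/app.js; ^~ means regexes are skipped.
    location ^~ /static/    { return 200 "prefix, regex skipped"; }

    # /app.js or /a/b.CSS: the longest prefix is "/", which is not ^~,
    # so the regex locations are tried and this one matches.
    location ~* \.(js|css)$ { return 200 "regex"; }

    # Everything else, for example /index.html.
    location /              { return 200 "catch-all"; }
}
```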

~~~~~~~~~~~~~~~~~~

We now need to set up the following redirects:

192.168.71.51/log.aspx -> 192.168.80.147:8338/log
192.168.71.51/do.aspx -> 192.168.80.147:8338/do
192.168.71.51/uplog.aspx -> 192.168.80.147:8338/log

This can be configured as follows:

……
server {
        listen       6061;
        server_name  192.168.71.51;

        # Order matters: the log.aspx pattern also matches uplog.aspx.
        rewrite  ^(.*)(?i)uplog\.aspx(.*)$  $1log$2  break;
        rewrite  ^(.*)(?i)log\.aspx(.*)$  $1log$2  break;
        rewrite  ^(.*)(?i)do\.aspx(.*)$  $1do$2  break;

        location / {
           proxy_pass                  http://log;
           proxy_redirect              off;
           proxy_set_header            Host $host;
           proxy_set_header            Remote_Addr $remote_addr;
           proxy_set_header            X-REAL-IP  $remote_addr;
           proxy_set_header            X-Forwarded-For $proxy_add_x_forwarded_for;

           proxy_connect_timeout       90;
           proxy_send_timeout          90;
           proxy_read_timeout          90;
           proxy_buffer_size           4k;
           proxy_buffers               4 32k;
           proxy_busy_buffers_size     64k;
           proxy_temp_file_write_size  64k;
        }
……

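A few notes on the rewrite configuration:

  1. rewrite usage: rewrite regex replacement flag
  2. The first and second rules must not be swapped. nginx applies rewrite rules from top to bottom, and the log.aspx pattern also matches uplog.aspx, so if it came first, /uplog.aspx would become /uplog instead of /log. The break flag stops the remaining rewrite rules from running once a rule has matched;
  3. (?i) makes the match case-insensitive (~* is sometimes suggested online, but that is a modifier for location and if, not for rewrite; my nginx version is 1.0.12);
  4. $1 and $2 refer to the parts captured by the groups in the regex;
  5. rewrite can appear in server as well as in location. nginx runs the rewrites in server first and then selects a location, so the location is matched against the rewritten URL. The rewrites in that location run next, and if they change the URL (without break), nginx searches for a location again with the new URL.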
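
As an aside on note 2, the ordering dependency can be avoided by anchoring each pattern to a single path. The sketch below is a narrower variant that assumes only the three top-level paths listed earlier need rewriting; the upstream named log is assumed to be defined elsewhere, as in the configuration above.

```nginx
server {
    listen       6061;
    server_name  192.168.71.51;

    # Each pattern matches exactly one path (case-insensitively),
    # so the rules can appear in any order. The query string is
    # carried over to the rewritten URI automatically.
    rewrite (?i)^/uplog\.aspx$ /log break;
    rewrite (?i)^/log\.aspx$   /log break;
    rewrite (?i)^/do\.aspx$    /do  break;

    location / {
        proxy_pass http://log;
    }
}
```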

For detailed rewrite usage, see this reference: http://blog.cafeneko.info/2010/10/nginx_rewrite_note/ (very detailed)

 

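Routing by URL parameter

In real development, requests often need to be routed to different handlers based on request parameters. Routing on POST parameters requires additional nginx modules, so here is a brief look at routing on GET parameters.

Using the same configuration file as above: when a URL such as http://192.168.71.51:6061/do1.aspx?t=1212&c=uplog has the parameter c set to config or uplog (case-insensitive), we want to route it somewhere else:

First add an upstream, for example: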

……
upstream other {  
    server 192.168.71.41:2210;

     }
……

Then add the following check in the location:

……
location / {

       if ( $query_string ~* "(^|&)c=(config|uplog)(&|$)" ) {
         proxy_pass                  http://other;
       }
……

The key is the if line. $query_string holds the URL parameters, and the regex after it matches c=config or c=uplog as a complete parameter anywhere in the query string. Note that if in nginx has many restrictions and a strict syntax; see the document referenced above for details.
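
The same routing can also be sketched without if, using map and the built-in $arg_c variable, which holds the value of the c parameter. The variable name $backend is made up for this example, and the definition of the log upstream is an assumption based on the redirect targets listed earlier.

```nginx
# Inside the http {} block.
upstream log   { server 192.168.80.147:8338; }   # assumed definition
upstream other { server 192.168.71.41:2210; }

# Pick the upstream group from the value of the "c" query parameter.
map $arg_c $backend {
    default              log;
    ~*^(config|uplog)$   other;
}

server {
    listen 6061;

    location / {
        # The name in $backend is looked up among the upstream groups.
        proxy_pass http://$backend;
    }
}
```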

 

A simple but very practical configuration. I hope it helps anyone looking for this information.

~~~~~~~~~~~~~~~~~~

I recently moved a cakePHP website from an Apache server to an Nginx one. I had to translate url rewriting rules from one syntax to the other, and here is what I learned.

First of all, Nginx's internal logic for processing rewrite rules is not as straightforward as Apache's. In Apache, rules are processed in the order in which they appear in your config file/.htaccess. In Nginx, they follow a more complex pattern.

Initial Apache rules

First of all, here is the (simplified) set of rules I had to convert:

RewriteRule ^(css|js)/packed_(.*)$ $1/packed/$2 [L]

RewriteRule ^files/([0-9]{4})/([0-9]{2})/([0-9]{2})/([[:alnum:]]{8}-[[:alnum:]]{4}-[[:alnum:]]{4}-[[:alnum:]]{4}-[[:alnum:]]{12})/(.*)\.(.{3,4})    /files/$1/$2/$3/$4.$6 [L]

RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?url=$1 [QSA,L]

The first rule deals with compressed css and js files. Minified css and js files are saved in /css/packed/ with a filename made of an md5 hash of the original filenames and a timestamp. So a url of /css/packed_6e4f31ffc48b6_1330851887.css will actually return the file located in /css/packed/6e4f31ffc48b6_1330851887.css

The second rule is about media files uploaded on the server. Each uploaded file is stored in the /files/ directory, in a subfolder made from the uploading date (like /files/2012/08/25/). The actual file is given a UUID when saved, and this UUID is used as its filename on disk. The rewrite rule allows the use of any custom filename when linking the file. This helps for SEO purposes as well as making it more user-friendly when we present a download to our users. So /files/2012/08/25/50483446-4b00-4d5b-8498-763e45a3e447/Subscription_form.pdf actually returns the file at /files/2012/08/25/50483446-4b00-4d5b-8498-763e45a3e447.pdf

And the last rule is the default cakePHP rewrite rule. It first checks if the requested url points to an existing directory or file, and if not dispatches it to the main entry point: index.php with the requested url as a parameter.

Converting it to Nginx

Rewrite rules in Nginx are usually found in location blocks. There are several ways you can define a location block, and it affects the order in which the rules will be parsed.

Nginx first checks for location = blocks. Those blocks are used to catch an exact match of the requested url. Once such a block is found, its content is applied, and Nginx stops looking for more matches.

location = /my-exact-file.html {
  rewrite /my-exact-file.html http://external-website.com/;
}

In this example, a request for /my-exact-file.html will be redirected to http://external-website.com. Note that you need to repeat the url in both the location = block and the rewrite rule.

The location = is of very limited use as it only accepts an exact match on a string. Much more useful are the location ~ blocks that perform matches on regex (and the location ~* for a case-insensitive version).

Such blocks are tested after the location = ones, in the order they appear in your configuration file. The first block that matches is the one Nginx uses; it does not go on to test the remaining location ~ blocks. Inside the block, the flag at the end of a rewrite (break or last) decides what happens to the rewritten url.

location ~ /(css|js)/packed_ {
  rewrite ^/(css|js)/packed_(.*)$ /$1/packed/$2 break;
}
location ~ /files {
  rewrite ^/files/(.*)/(.*)/(.*)\.(.*)$ /files/$1/$2.$4 break;
}

In the first rule I'm looking for any /css/packed_* or /js/packed_* request, and converting them to /css/packed/* or /js/packed/*. Note the use of backreferences in the rewrite using $x variables. In the second rule I simplified the original regex from Apache to catch the date path 2012/08/25 in $1, the UUID in $2, the filename in $3 and the extension in $4, rewriting the request to the correct file on disk.

Both rewrites end with the break flag. It tells Nginx to stop processing rewrite directives and serve the rewritten url from within the current location block. Another useful flag is last, which tells Nginx to restart its location matching from the beginning, this time using the newly rewritten url.
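
To make the difference between the two flags concrete, here is a small sketch. The paths, the document root and the file extensions are made up for the example and are not part of my site's configuration.

```nginx
server {
    root /var/www/example;   # hypothetical document root

    location /old/ {
        # last: stop here and search for a location again,
        # this time with the rewritten url (/new/...).
        rewrite ^/old/(.*)$ /new/$1 last;
    }

    location /new/ {
        # break: stop processing rewrites and serve the rewritten
        # url from inside this location block.
        rewrite ^/new/(.*)\.htm$ /new/$1.html break;
    }
}
```

With this in place, a request for /old/page.htm is first rewritten to /new/page.htm, matched again against the locations, and finally served as the file /new/page.html.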

There is one last location block that we can use, and it's the simple location, without any prefix. These location blocks are matched by prefix, and Nginx falls back to the longest matching one when no location = or location ~ block has matched. They are especially good for a last "catch all" solution, and we are going to use them to dispatch urls to index.php

location / {
  try_files $uri /index.php?url=$request_uri;
}

Using location /, we'll catch any remaining requests. The try_files command tests its arguments in order to see if they exist on disk and serves the first one that does; the last argument is the fallback and is used without being checked. So in our example it will first check for the requested uri, and if such a file exists, will serve it. Otherwise it will simply dispatch it to the main index.php with the requested url as an argument and cakePHP will do the rest.

There is one last thing we must do: tell Nginx to pass any .php file to the PHP fastcgi. This is quite easy using a location ~ block matching any .php file. This also applies to the index.php fallback reached through try_files.

location ~ \.php$ {
  fastcgi_pass   127.0.0.1:9000;
  fastcgi_index  index.php;
  fastcgi_intercept_errors on;
  include fastcgi.conf;
}

Conclusion

Wrapping your mind around the order in which Nginx applies your rewrites is not easy at first. I hope this post helped you make sense of it.

Note that there is also the location ^~ block, a prefix match that skips the regex checks when it is the longest matching prefix. I found it to be of very limited use, as its behavior can be replicated with the more generic location ~ blocks.
