Causes of 502 Bad Gateway errors

1. Not enough php-fpm processes
2. Linux kernel open-file limit set too low
3. Script execution timing out
4. Buffer sizes set too small

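The site was intermittently returning 502. My first thought was that the problem lay not in the application but in the nginx server, because the errors were coming from the proxy server. The proxy server does not have PHP installed, which rules out the first cause.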

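That made me suspect a timeout, so I adjusted some of the timeout settings.
Below is the server's original configuration (the relevant parts).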

http section

server_names_hash_bucket_size 64;
client_header_buffer_size 128k;
large_client_header_buffers 4 32k;
client_max_body_size 50m;

keepalive_timeout 60;
fastcgi_connect_timeout 60;
fastcgi_send_timeout 60;
fastcgi_read_timeout 600;
fastcgi_buffer_size 64k;
fastcgi_buffers 4 128k;
fastcgi_busy_buffers_size 128k;
fastcgi_temp_file_write_size 256k;

gzip_buffers 4 128k;

server section

upstream myweb {
    server 10.10.10.1:80 max_fails=3 fail_timeout=30s;
    server 10.10.10.2:80 max_fails=3 fail_timeout=30s;
    ip_hash;
}

location / {
    proxy_pass http://myweb;
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_redirect off;
}

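Step 1: change the timeouts. I only touched the http section, raising the timeouts and multiplying most of the buffer sizes several times over.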

server_names_hash_bucket_size 512;
client_header_buffer_size 512k;
large_client_header_buffers 16 128k;
client_max_body_size 256m;

keepalive_timeout 600;
fastcgi_connect_timeout 600;
fastcgi_send_timeout 600;
fastcgi_read_timeout 600;
fastcgi_buffer_size 256k;
fastcgi_buffers 16 512k;
fastcgi_busy_buffers_size 512k;
fastcgi_temp_file_write_size 1024k;

gzip_buffers 16 512k;

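The frequency of 502s did not drop; it was the same as before. In hindsight that is not surprising: the fastcgi_* directives only affect requests handled by fastcgi_pass, and this server forwards everything through proxy_pass.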

Step 2: change the proxy timeouts in the server section

location / {
    proxy_pass http://myweb;
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_redirect off;

    proxy_connect_timeout 300s;
    proxy_send_timeout 300s;
    proxy_read_timeout 300s;
}

The frequency of 502s dropped a little, but not as much as I had hoped, so I changed the proxy buffers as well.

location / {
    proxy_pass http://myweb;
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto http;
    proxy_redirect off;
    proxy_connect_timeout 300s;
    proxy_send_timeout 300s;
    proxy_read_timeout 300s;
    proxy_buffer_size 512k;
    proxy_buffers 32 512k;
    proxy_busy_buffers_size 512k;
    proxy_temp_file_write_size 512k;
    proxy_ignore_client_abort on;
}

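The frequency of 502s was the same as after the previous step, with no noticeable improvement. I then opened the nginx error log to see what was actually failing, and it showed: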

[error] 20435#0: *3890606 no live upstreams while connecting to upstream, client:

This means nginx found no live backends. But there are two backend servers, so how could that be?
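To gauge how often both backends were being marked down at once, one option is to tally this message per minute in the error log. The following is a minimal sketch in Python, assuming each error-log line starts with nginx's usual `YYYY/MM/DD HH:MM:SS` timestamp; the function name `count_no_live_upstreams` is made up for this illustration.

```python
from collections import Counter

# The message text is taken from the error log line quoted above.
MARKER = "no live upstreams while connecting to upstream"


def count_no_live_upstreams(lines):
    """Count matching error-log lines per minute.

    Assumes every line begins with a 'YYYY/MM/DD HH:MM:SS' timestamp,
    so the first 16 characters identify the minute.
    """
    per_minute = Counter()
    for line in lines:
        if MARKER in line:
            per_minute[line[:16]] += 1
    return per_minute
```

It accepts any iterable of lines, so an open file object can be passed in directly. The log location varies by installation, so the path is left to the reader.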

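My guess was that nginx evaluates the backends while waiting for their responses: if a backend responds slowly, nginx may take it out of rotation, so both backends could end up removed at the same time.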

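So the problem was in the upstream configuration. The original configuration set max_fails=3 fail_timeout=30s explicitly (nginx's own defaults are max_fails=1 and fail_timeout=10s).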

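I first tried max_fails=10 fail_timeout=60s. The frequency of 502s dropped a lot, but when a 502 did occur it lasted quite a while, because fail_timeout is also how long a failed server is kept out of rotation. Below is the final upstream configuration: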

upstream myweb {
    server 10.10.10.1:80 max_fails=60 fail_timeout=10s;
    server 10.10.10.2:80 max_fails=60 fail_timeout=10s;
    ip_hash;
}
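To illustrate why the original thresholds could take both backends out at once while the final ones are far less likely to, here is a deliberately simplified Python model of passive failure counting. It is not nginx's implementation: the `Peer` class and `live_peers` helper are hypothetical. It only assumes the documented meaning of the two parameters: max_fails failures within fail_timeout mark a server unavailable for fail_timeout.

```python
class Peer:
    """Simplified stand-in for one upstream server (illustrative only)."""

    def __init__(self, max_fails, fail_timeout):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.failures = []      # timestamps of recent failed attempts
        self.down_until = 0.0   # the peer is skipped before this time

    def record_failure(self, now):
        # Forget failures that fall outside the fail_timeout window.
        self.failures = [t for t in self.failures
                         if now - t < self.fail_timeout]
        self.failures.append(now)
        if len(self.failures) >= self.max_fails:
            # Threshold reached: take the peer out for fail_timeout seconds.
            self.down_until = now + self.fail_timeout
            self.failures.clear()

    def is_available(self, now):
        return now >= self.down_until


def live_peers(peers, now):
    """Return the peers that may still receive requests at time `now`."""
    return [p for p in peers if p.is_available(now)]
```

In this model, three failures per backend within a few seconds leave `live_peers` empty for two `Peer(3, 30)` instances, which corresponds to the "no live upstreams" situation, whereas two `Peer(60, 10)` instances both stay in rotation under the same failures.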

A possible next optimization is to raise the Linux open-file limit. I am noting the commands here for reference:

echo 'ulimit -HSn 65536' >> /etc/profile
echo 'ulimit -HSn 65536' >> /etc/rc.local
source /etc/profile
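To confirm that the new limit is actually in effect, the current values can be read back. The sketch below uses Python's standard `resource` module; the helper names are made up for this example. Limits are per process, so this reports the limit of the process running the script (for example a fresh shell after sourcing /etc/profile), not that of an nginx instance that is already running.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_at_least(target):
    """Check whether the soft open-file limit is at least `target`."""
    soft, _hard = open_file_limits()
    # RLIM_INFINITY means "no limit", which satisfies any target.
    return soft == resource.RLIM_INFINITY or soft >= target
```

Calling `soft_limit_at_least(65536)` from a shell that picked up the change should then return True.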

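Summary: this was my first time using nginx as a proxy server. I was not familiar with the various parameters and did no load testing before going live, which led to the 502 problem.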
