Nginx upstream with multiple Redis servers produces "Connection timed out" errors

I have an nginx upstream with 4 Redis servers. From time to time I see errors like this in the nginx error log (20-30 per minute, and only for the first and second servers in the upstream):

... upstream timed out (110: Connection timed out) while connecting to upstream ... upstream: "redis2://AAA.BBB.CCC.DDD:6379" ...
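
For reference, the per-server counts can be tallied from the error log with something like the sketch below. It assumes each log entry is a single line containing the message text and upstream field quoted above; the sample lines and their 10.0.0.x addresses are made up for illustration, and the function name count_connect_timeouts is my own.

```python
import re
from collections import Counter

# Extracts the address from the upstream field of an nginx error-log line,
# e.g. upstream: "redis2://10.0.0.1:6379"
UPSTREAM_RE = re.compile(r'upstream: "redis2://([^"]+)"')
TIMEOUT_MARKER = (
    "upstream timed out (110: Connection timed out) while connecting to upstream"
)


def count_connect_timeouts(lines):
    """Return a Counter mapping upstream address to number of connect timeouts."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_MARKER not in line:
            continue  # ignore other kinds of errors
        match = UPSTREAM_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines; addresses and timestamps are invented.
SAMPLE = [
    '2015/03/10 12:00:01 [error] 1#0: *1 upstream timed out (110: Connection timed out) '
    'while connecting to upstream, upstream: "redis2://10.0.0.1:6379"',
    '2015/03/10 12:00:02 [error] 1#0: *2 upstream timed out (110: Connection timed out) '
    'while connecting to upstream, upstream: "redis2://10.0.0.2:6379"',
    '2015/03/10 12:00:03 [error] 1#0: *3 upstream timed out (110: Connection timed out) '
    'while connecting to upstream, upstream: "redis2://10.0.0.1:6379"',
    '2015/03/10 12:00:04 [error] 1#0: *4 connect() failed (111: Connection refused) '
    'while connecting to upstream, upstream: "redis2://10.0.0.3:6379"',
]

if __name__ == "__main__":
    for address, hits in count_connect_timeouts(SAMPLE).most_common():
        print(address, hits)
```

To run it against the real log, pass an open file object for /var/log/nginx/error.log to count_connect_timeouts instead of SAMPLE.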

The load average on both the Redis servers and the nginx server is below 1, and all of them run CentOS 6.6. Nginx handles 250-350 RPS.

What could be causing these errors? Thanks in advance.

nginx.conf

user nginx;
worker_processes  4;
timer_resolution 100ms;
worker_priority -15;
worker_rlimit_nofile 200000;

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
  worker_connections  65536;
  use epoll;
  multi_accept on;
}

http {

  include       /etc/nginx/mime.types;
  default_type  application/octet-stream;

  access_log    /var/log/nginx/access.log;

  sendfile on;
  tcp_nopush on;
  tcp_nodelay on;

  keepalive_timeout  65;

  gzip  on;
  gzip_http_version 1.0;
  gzip_comp_level 2;
  gzip_proxied any;
  gzip_vary off;
  gzip_types text/plain text/css application/x-javascript text/xml application/xml application/rss+xml application/atom+xml text/javascript application/javascript application/json text/mathml;
  gzip_min_length  1000;
  gzip_disable     "MSIE [1-6]\.";

  server_names_hash_bucket_size 64;
  types_hash_max_size 2048;
  types_hash_bucket_size 64;

  include /etc/nginx/sites-enabled/*;
}

upstream config:

upstream redis_cluster {
    server redis1.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
    server redis2.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
    server redis3.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
    server redis4.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
}
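
Since the timeouts occur while connecting, one check that could be run from the nginx host is to time a bare TCP connect to each server in the upstream. The following is only a sketch: the function name connect_time and the 1-second timeout are my own choices and have nothing to do with nginx's settings.

```python
import socket
import sys
import time


def connect_time(host, port, timeout=1.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return None


if __name__ == "__main__":
    # Hosts are taken from the command line; port 6379 matches the upstream above.
    for host in sys.argv[1:]:
        elapsed = connect_time(host, 6379)
        if elapsed is None:
            print(host, "failed")
        else:
            print(host, f"{elapsed * 1000:.1f} ms")
```

Saved under a hypothetical name such as connect_probe.py, it would be called with the four hostnames as arguments, for example redis1.mydomain.com through redis4.mydomain.com. Running it in a loop might show whether the first two servers are slower to accept connections than the others.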

sysctl.conf (on the nginx server, edited lines only)

net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.tcp_max_syn_backlog = 20480
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.netfilter.nf_conntrack_max = 1048576
net.nf_conntrack_max = 1048576
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_tw_reuse = 1
net.core.somaxconn = 15000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 5

sysctl.conf (on the Redis servers; essentially the same, edited lines only)

vm.overcommit_memory = 1
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.tcp_max_syn_backlog = 20480
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.netfilter.nf_conntrack_max = 1048576
net.nf_conntrack_max = 1048576
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_tw_reuse = 1
net.core.somaxconn = 15000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 5