Ben Fyvie | 5 Oct 23:37 2009

502 errors continue

We have implemented haproxy between an nginx web server and 3 mongrel instances. With this configuration we receive intermittent 502 errors; these errors seem to occur only when a single mongrel instance is restarting (at least, that is how it appears). We were running haproxy version 1.3.15.5 until I came across the following:

http://www.mail-archive.com/haproxy-JklxK3liFipg9hUCZPvPmw@public.gmane.org/msg00864.html

 

So today we upgraded to 1.3.20 in hopes that our 502 problems would be resolved; however, they continue.

 

Some interesting tidbits:

  1. without haproxy we don’t receive any 502 errors

  2. prior to the upgrade to 1.3.20 we had one request to create a new record. That request returned a 502 to the end-user, but the record was actually created (this means the mongrels are processing the requests)

 

Are there steps we can take (such as specific logging options) to pinpoint what is causing haproxy to throw the 502 errors?

 

Ben Fyvie

 

 

Michael Marano | 6 Oct 19:25 2009

Kernel tuning recommendations

We’ve completed a move to Rackspace Cloud and are now using HAProxy as our load balancer. HAProxy is a phenomenal piece of software.

The primary issue I’ve noticed with haproxy is that my backends are frequently flapping DOWN/UP, and we’re seeing some long request times as well as serving occasional 504s. I’ve been doing my research and understand that I need to do some system tuning via sysctl to get things running properly. All the references have different recommendations on which parameters to tune, and I’m a bit hesitant to copy/paste from multiple sources.

Is there a baseline set of recommended tunings that I can apply as a first response before digging into the gritty details?

I’ve attached a bunch of details below. Thanks for any help you can provide.

Michael

-------

[mmarano <at> w1 w1]$ cat /etc/redhat-release
CentOS release 5.3 (Final)
[mmarano <at> w1 w1]$ uname -a
Linux w1.gamesradar.com 2.6.24-23-xen #1 SMP Mon Jan 26 03:09:12 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux
[mmarano <at> w1 w1]$ /usr/sbin/haproxy -v
HA-Proxy version 1.3.18 2009/05/10

Here’s what I’m finding in /var/log/messages:

Oct  2 23:12:38 w1 kernel: [1556670.291082] printk: 482 messages suppressed.
Oct  2 23:12:38 w1 kernel: [1556670.291102] nf_conntrack: table full, dropping packet.
Oct  3 01:34:49 w1 kernel: [1566552.616316] possible SYN flooding on port 80. Sending cookies.
Oct  3 03:19:52 w1 kernel: [1572838.886342] printk: 294 messages suppressed.


[mmarano <at> w1 w1]$ sudo /sbin/sysctl -a | grep ^net
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_retrans_collapse = 1
net.ipv4.ip_forward = 0
net.ipv4.ip_default_ttl = 64
net.ipv4.ip_no_pmtu_disc = 0
net.ipv4.ip_nonlocal_bind = 0
net.ipv4.tcp_syn_retries = 5
net.ipv4.tcp_synack_retries = 5
net.ipv4.tcp_max_orphans = 65536
net.ipv4.tcp_max_tw_buckets = 180000
net.ipv4.ipfrag_high_thresh = 262144
net.ipv4.ipfrag_low_thresh = 196608
net.ipv4.ip_dynaddr = 0
net.ipv4.ipfrag_time = 30
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_retries2 = 15
net.ipv4.tcp_fin_timeout = 60
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_abort_on_overflow = 0
net.ipv4.tcp_stdurg = 0
net.ipv4.tcp_rfc1337 = 0
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.ip_local_port_range = 32768    61000
net.ipv4.icmp_echo_ignore_all = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.icmp_errors_use_inbound_ifaddr = 0
net.ipv4.route.min_delay = 2
net.ipv4.route.max_delay = 10
net.ipv4.route.gc_thresh = 262144
net.ipv4.route.max_size = 4194304
net.ipv4.route.gc_min_interval = 0
net.ipv4.route.gc_min_interval_ms = 500
net.ipv4.route.gc_timeout = 300
net.ipv4.route.gc_interval = 60
net.ipv4.route.redirect_load = 5
net.ipv4.route.redirect_number = 9
net.ipv4.route.redirect_silence = 5120
net.ipv4.route.error_cost = 250
net.ipv4.route.error_burst = 1250
net.ipv4.route.gc_elasticity = 8
net.ipv4.route.mtu_expires = 600
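
A baseline that often gets suggested for a dedicated proxy box in this situation is sketched below; every value is an assumption to validate under load rather than a settled recommendation. Note that the dump above shows tcp_max_syn_backlog at its 1024 default, which lines up with the SYN-cookie messages:

# sketch of an /etc/sysctl.conf fragment -- assumed values, tune per workload
net.ipv4.tcp_max_syn_backlog = 16384       # default of 1024 was overflowing, triggering syncookies
net.core.somaxconn = 16384                 # accept-queue ceiling; the kernel default is only 128
net.ipv4.ip_local_port_range = 1024 65000  # more source ports for proxy-to-backend connections
net.ipv4.tcp_max_tw_buckets = 400000       # headroom for TIME_WAIT churn through the proxy

Load it with `sudo /sbin/sysctl -p` and watch whether the warnings in /var/log/messages stop.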
Chris Sarginson | 6 Oct 19:56 2009

Re: Kernel tuning recommendations

The first piece of advice you will receive is to disable the nf_conntrack module :)

That should give a performance improvement.  I will send on my sysctl parameters if possible later. I would also recommend sending in your haproxy config, and upgrading to the haproxy version in the EPEL repos if you don't wish to compile from source.
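
Before deciding, it is worth checking how close the box actually is to the conntrack ceiling; something along these lines should work (the names assume the nf_conntrack-era sysctls seen in the log above; older kernels expose net.ipv4.netfilter.ip_conntrack_* instead):

sudo /sbin/sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

If the count rides near the max under load, the options are to unload the module, raise the max, or exempt the proxy traffic from tracking.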

Chris

Sent from my iPhone

On 6 Oct 2009, at 18:25, Michael Marano <mmarano-dvMdDlB33HxWk0Htik3J/w@public.gmane.org> wrote:

We’ve completed a move to Rackspace Cloud and are now using HAproxy as our load balancer. [...]
Willy Tarreau | 6 Oct 20:36 2009

Re: 502 errors continue

Hi,

On Mon, Oct 05, 2009 at 04:37:16PM -0500, Ben Fyvie wrote:
> We have implemented haproxy between an nginx web server and 3 mongrel
> instances. With this configuration we receive intermittent 502 errors, these
> errors seem to only occur when a single mongrel instance is restarting (at
> least that is how it seems). We were running haproxy version 1.3.15.5 until
> I came across the following:
> 
> http://www.mail-archive.com/haproxy-JklxK3liFipg9hUCZPvPmw <at> public.gmane.org/msg00864.html
> 
>  
> 
> So today we upgraded to 1.3.20 in hopes that our 502 problems would be
> resolved, however they continue. 
> 
>  
> 
> Some interesting tidbits:
> 
> 1.	without haproxy we don't receive any 502 errors
> 2.	prior to the upgrade to 1.3.20 we had one request to create a new
> record. That request returned a 502 to the end-user, but the record was
> actually created (this means the mongrels are processing the requests)
> 
>  
> 
> Are there steps we can take (such as specific logging options) to pinpoint
> what is causing haproxy to throw the 502 errors?

Yes, first, you should absolutely check the logs. The answer is there.
Just grep for ' 502 ' in your logs and look at the flags; they will indicate
what caused that error.
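
For example, something like this (the path is an assumption; it depends on where your syslog writes haproxy's local* output):

grep ' 502 ' /var/log/haproxy.log

In each matching line, the four-character termination state (e.g. "SH--") is the part to read: the first letter tells which side ended the session ('S' = the server aborted or the connection to it failed), and the second tells in which phase ('H' = while haproxy was waiting for the response headers).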

Regards,
Willy

Ben Fyvie | 6 Oct 21:53 2009

RE: 502 errors continue

Thanks Willy, 

This is what is captured in the log with "option httplog" set:

Oct  6 12:34:13 localhost haproxy[20906]: 127.0.0.1:44287 [06/Oct/2009:12:33:55.487] ourapp_staging ourapp_staging/ourapp_5002 0/0/0/-1/17769 502 204 - - SH-- 5/5/5/1/0 0/0 "GET /clients/38275/edit HTTP/1.0"

I've pasted the relevant info from our log here as well: 
http://pastie.org/private/nfvbtq236x8jp9ynj4dg

Some more details about our environment:
We have two apps on the same server; both use 3 mongrels.
One app uses mongrels on ports 5000 - 5002
The other app uses mongrels on ports 5100 - 5102
Both apps are configured to use haproxy.
Our haproxy.cfg can be found here:
http://pastie.org/private/oyvusbvtvxi3qpultrvla 

We have monit monitoring the size of our mongrels and restarting them when their
memory footprint becomes too large. We believe the 502 errors are thrown on
the last request handled by a mongrel prior to its restart, but that is based
only on observation and we cannot be sure how accurate it is.

Any thoughts?

Ben Fyvie
-----Original Message-----
From: Willy Tarreau [mailto:w <at> 1wt.eu] 
Sent: Tuesday, October 06, 2009 1:37 PM
To: Ben Fyvie
Cc: haproxy@...
Subject: Re: 502 errors continue

Hi,


Yes, first, you should absolutely check the logs. The answer is there.
Just grep for ' 502 ' in your logs and look at the flags; they will indicate
what caused that error.

Regards,
Willy

Michael Marano | 6 Oct 22:24 2009

Re: Kernel tuning recommendations

Chris,

Thanks for the swift reply.

I’m running iptables on this server as well to provide basic firewalling.
Can I safely disable conntrack or any other netfilter modules?

I’ve upgraded our staging site for some testing and will upgrade in
production tonight.  I’ve included my haproxy.cfg below. It’s a bit lengthy,
but http_default is the important backend.

Michael Marano

------

global
    log 127.0.0.1   local0 info # frontend logs -> connections
    log 127.0.0.1   local1 notice notice # backend logs -> UP and DOWN
    maxconn 100000
    #chroot /usr/share/haproxy
    user haproxy
    group haproxy
    spread-checks 5
    daemon
    #debug
    #quiet

defaults http-in
    log    global
    mode    http
    option    httplog
    option    dontlognull
    option     forwardfor
    option     httpclose
    retries    3
    option redispatch
    maxconn    100000
    timeout connect 5000
    timeout http-request 5000
    timeout client 50000
    timeout server 50000
    stats enable
    #stats scope    .
    stats uri    /haproxy?stats
    stats realm    Haproxy\ Statistics
    stats auth    **********:*************

####################
#
# frontends
#
####################

frontend http_proxy
    bind     :80
    log    global
    capture request header Host len 30

    ####################
    # access control rules
    ####################

    # static 
    acl static_dom hdr_dom(host) static
    acl beta_static_dom hdr_beg(host) static.beta.gamesradar.com
    acl base_static_dom hdr_beg(host) static.gamesradar.com

    # api
    acl api_dom hdr_dom(host) api
    acl beg_api_dom hdr_beg(host) api

    # user images
    acl m1_dom hdr_dom(host) m1
    acl m2_dom hdr_dom(host) m2
    acl m3_dom hdr_dom(host) m3

    # forums
    acl forums_url url_beg /forums
    acl forums_dom hdr_dom(host) forums
    acl forum_dom hdr_dom(host) forum

    # newsletterapi
    acl newsletterapi_url url_beg /newsletterapi

    ####################
    # backend mappings
    ####################

    # static
    use_backend static_http if static_dom
    use_backend static_http if beta_static_dom
    use_backend static_http if base_static_dom

    # api 
    use_backend  tomcat_http_default if api_dom
    use_backend  tomcat_http_default if beg_api_dom

    # user images
    use_backend  m1_http if m1_dom
    use_backend  m2_http if m2_dom
    use_backend  m3_http if m3_dom

    # forums
    use_backend  forums_http if forums_url

    # newsletterapi
    use_backend  tomcat_http_default if newsletterapi_url

    # default
    default_backend http_default

frontend tomcat_http_proxy
    bind     :8080
    log    global
    capture request header Host len 30

    # default
    default_backend tomcat_http_default

####################
#
# backends
#
####################

backend http_default
    option    httpchk OPTIONS /test.jsp HTTP/1.1\r\nHost:\ www.gamesradar.com
    option     redispatch
    balance roundrobin
    #fullconn 1000
    server    a1_w a1.gamesradar.com:80 check port 80 inter 4000 fall 3 rise 2 maxconn 350
    server    a2_w a2.gamesradar.com:80 check port 80 inter 4000 fall 3 rise 2 maxconn 350
    server    a3_w a3.gamesradar.com:80 check port 80 inter 4000 fall 3 rise 2 maxconn 350

backend tomcat_http_default
    option    httpchk OPTIONS /test.jsp HTTP/1.1\r\nHost:\ www.gamesradar.com
    option     redispatch
    balance roundrobin
    #fullconn 1000
    server    a1_t a1.gamesradar.com:8080 check port 8080 inter 4000 fall 3 rise 2 maxconn 195
    server    a2_t a2.gamesradar.com:8080 check port 8080 inter 4000 fall 3 rise 2 maxconn 195
    server    a3_t a3.gamesradar.com:8080 check port 8080 inter 4000 fall 3 rise 2 maxconn 195

backend forums_http
    option    httpchk HEAD /forums/help/help-text.html HTTP/1.1\r\nHost:\ www.gamesradar.com
    option     redispatch
    balance roundrobin
    #fullconn 1000
    server    f1_w f.gamesradar.com:80 check inter 2000 maxconn 295

backend static_http
    option    httpchk HEAD /test.html
    option     redispatch
    balance roundrobin
    server    w1_static w1.gamesradar.com:81 check port 81 inter 2000 maxconn 1020
    server    w2_static w2.gamesradar.com:81 check port 81 inter 2000 maxconn 1020

# primary/failover setup for m1.gamesradar.com
backend m1_http
    option    httpchk OPTIONS / HTTP/1.1\r\nHost:\ m1.gamesradar.com
    option     redispatch
    server    m1_w a1.gamesradar.com:80 check inter 2000 maxconn 300
    server    m1_bk1 a2.gamesradar.com:80 check inter 2000 backup maxconn 300
    server    m1_bk2 a3.gamesradar.com:80 check inter 2000 backup maxconn 300

# primary/failover setup for m2.gamesradar.com
backend m2_http
    option    httpchk OPTIONS / HTTP/1.1\r\nHost:\ m2.gamesradar.com
    option     redispatch
    server    m2_w a2.gamesradar.com:80 check inter 2000 maxconn 300
    server    m2_bk1 a1.gamesradar.com:80 check inter 2000 backup maxconn 300
    server    m2_bk2 a3.gamesradar.com:80 check inter 2000 backup maxconn 300

# primary/failover setup for m3.gamesradar.com
backend m3_http
    option    httpchk OPTIONS / HTTP/1.1\r\nHost:\ m3.gamesradar.com
    option     redispatch
    server    m3_w a3.gamesradar.com:80 check inter 2000 maxconn 300
    server    m3_bk1 a1.gamesradar.com:80 check inter 2000 backup maxconn 300
    server    m3_bk2 a2.gamesradar.com:80 check inter 2000 backup maxconn 300

From: Chris Sarginson <chris@...>
Date: Tue, 6 Oct 2009 18:56:19 +0100
To: Michael Marano <mmarano@...>
Cc: "<haproxy@...>" <haproxy@...>
Subject: Re: Kernel tuning recommendations

The first piece of advice you will receive is to disable the nf_conntrack module :)

That should give a performance improvement.  I will send on my sysctl parameters if possible later. I would also recommend sending in your haproxy config, and upgrading to the haproxy version in the EPEL repos if you don't wish to compile from source.

Chris


Stefan Johansson | 6 Oct 22:27 2009

RE: Kernel tuning recommendations

If you need iptables and don't want to turn off the conntrack module, you can try the NOTRACK target in the PREROUTING chain.
If you set notrack on the ports handled by haproxy (e.g. 80) none of that traffic should be recorded in the tables.
 
iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
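
To verify it took effect, remember that the raw table has to be named explicitly; a bare listing only shows the filter table, and -v adds packet counters that confirm traffic is actually hitting the rule:

iptables -t raw -L -n -v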
 
/Stefan
 
Date: Tue, 6 Oct 2009 10:25:41 -0700
Subject: Kernel tuning recommendations
From: mmarano-dvMdDlB33HxWk0Htik3J/w@public.gmane.org
To: haproxy-JklxK3liFipg9hUCZPvPmw@public.gmane.org

We’ve completed a move to Rackspace Cloud and are now using HAproxy as our load balancer. [...]

Michael Marano | 6 Oct 22:49 2009

Re: Kernel tuning recommendations

Stefan,

That seems to have eliminated any log messages in my staging environment under a load test.  I think that will do the trick. Thanks for your help.

Any general recommendations for sysctl settings would still be appreciated.  This is the first time I’ve had to tune the kernel settings so any guidance will help.

Michael Marano

--
Senior Manager of Web Development
Future US, Inc.
desk:    650-238-2530
cell:    650-580-2132
twitter: <at> mmarano
aim:     michaelvicmarano
skype:   mmarano



From: Stefan Johansson <phunqe <at> hotmail.com>
Date: Tue, 6 Oct 2009 22:27:49 +0200
To: Michael Marano <mmarano <at> futureus.com>, <haproxy-JklxK3liFipg9hUCZPvPmw@public.gmane.org>
Subject: RE: Kernel tuning recommendations

iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
Michael Marano | 7 Oct 00:04 2009

Re: Kernel tuning recommendations

Subsequent load tests proved me wrong.  I’m still getting the nf_conntrack
messages.  Perhaps I’ve misconfigured my iptables rules?

# bits of /var/log/messages

Oct  6 21:58:40 w1 kernel: [3718555.091684] printk: 2 messages suppressed.
Oct  6 21:58:40 w1 kernel: [3718555.091705] nf_conntrack: table full, dropping packet.
Oct  6 21:58:41 w1 kernel: [3718290.353966] device eth0 entered promiscuous mode
Oct  6 21:58:43 w1 kernel: [3718558.070993] nf_conntrack: table full, dropping packet.
Oct  6 21:58:44 w1 kernel: [3718559.097679] nf_conntrack: table full, dropping packet.

I’ve got this in a shell script:

----
#!/bin/sh

sudo /sbin/iptables -F
sudo /sbin/iptables -A INPUT -i lo -j ACCEPT
sudo /sbin/iptables -A INPUT -i ! lo -d 127.0.0.0/8 -j REJECT
sudo /sbin/iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
sudo /sbin/iptables -A OUTPUT -j ACCEPT

# tell iptables to skip tracking on ports haproxy is monitoring
sudo /sbin/iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
sudo /sbin/iptables -t raw -A PREROUTING -p tcp --dport 8080 -j NOTRACK

# ... Rules to allow stuff...

sudo /sbin/iptables -A INPUT -j REJECT
sudo /sbin/iptables -A FORWARD -j REJECT
------

But then when I list my tables, I’m not seeing anything about the NOTRACK
rules.

-----
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
REJECT     all  --  0.0.0.0/0            127.0.0.0/8         reject-with icmp-port-unreachable
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:81
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:80
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:443
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:8080
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:22
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0           icmp type 8
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-port-unreachable

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0

Chain RH-Firewall-1-INPUT (0 references)
target     prot opt source               destination

-----

Michael Marano
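
Two things stand out in the listing above. First, it shows the filter table only: `iptables -L` without `-t raw` never displays PREROUTING NOTRACK rules, so they may well be loaded even though nothing appears here; `sudo /sbin/iptables -t raw -L -n -v` would confirm, with packet counters showing whether traffic hits them. Second, PREROUTING only sees packets arriving from the network, while the connections haproxy itself opens to the backends are locally generated and traverse the raw OUTPUT chain, so they are still being tracked and can keep filling the table. A sketch of the extra rules that would exempt those too (hedged: once untracked, the return packets no longer match `--state ESTABLISHED`, so they need explicit ACCEPTs, ideally restricted to the backend IPs rather than the bare source ports used here):

# exempt haproxy's outbound connections to the backends from tracking
sudo /sbin/iptables -t raw -A OUTPUT -p tcp --dport 80 -j NOTRACK
sudo /sbin/iptables -t raw -A OUTPUT -p tcp --dport 8080 -j NOTRACK
# their return packets arrive from the network, so untrack those as well
sudo /sbin/iptables -t raw -A PREROUTING -p tcp --sport 80 -j NOTRACK
sudo /sbin/iptables -t raw -A PREROUTING -p tcp --sport 8080 -j NOTRACK
# and accept them explicitly, since untracked packets cannot match ESTABLISHED
sudo /sbin/iptables -I INPUT -p tcp --sport 80 -j ACCEPT
sudo /sbin/iptables -I INPUT -p tcp --sport 8080 -j ACCEPT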


