Introduction
Nginx is an open-source, high-performance HTTP server, reverse proxy, and IMAP/POP3 proxy server. Nginx was built with performance in mind, and there is little to optimize in Nginx itself. Instead, you can modify some of the default configuration options to help it handle high traffic loads.
Besides Nginx itself, the main, and often overlooked, component of any running site is the server it runs on. Rather than focusing only on the software, this article also discusses some system changes that can help; the system in this case is a GNU/Linux distribution.
Common Tools
When you are ready to make changes, a few common tools can help you understand what is currently happening in the system and measure the effects of your changes.
vmstat
Use the vmstat command to run a report of system activity:
[root@ks4001893 ~]# vmstat 3 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    188 1735960 470580 5069636    0    0     4     9    3    1  0  0 100  0  0
 0  0    188 1736040 470580 5069660    0    0     0   105   72  100  0  0 100  0  0
 0  0    188 1736212 470580 5069664    0    0     0     5   49  104  0  0 100  0  0
 0  0    188 1736244 470580 5069664    0    0     0     4   50  109  0  0 100  0  0
 0  0    188 1736384 470580 5069668    0    0     0     0   40   83  0  0 100  0  0
 0  0    188 1736384 470580 5069668    0    0     0     4   51  114  0  0 100  0  0
 0  0    188 1736260 470580 5069672    0    0     0     1   47  109  0  0 100  0  0
 0  0    188 1736260 470580 5069676    0    0     0     4   46   95  0  0 100  0  0
 0  0    188 1736136 470580 5069676    0    0     0     8   44   86  0  0 100  0  0
 0  0    188 1736136 470580 5069680    0    0     0    12   53  110  0  0 100  0  0
[root@ks4001893 ~]#
When you run this report, use a separate tool to generate load against the Nginx server so that you can see what is happening in the system while Nginx is under load. Specifically, pay attention to the CPU and I/O sections. Under CPU, a good standard is an 80/20 system-to-user utilization ratio during high Nginx load. Additionally, if the wa (wait) column begins to increase substantially, you will want to look into optimizing the disk reads and writes of other software coupled with Nginx (an RDBMS, for example).
strace
Use strace to debug or monitor system calls made by running programs and processes:
[root@ks4001893 ~]# strace -vvv -p 21422 -r
Process 21422 attached - interrupt to quit
     0.000000 epoll_wait(27, {{EPOLLIN, {u32=27129504, u64=27129504}}}, 512, 4294967295) = 1
     3.620656 accept4(18, {sa_family=AF_INET, sin_port=htons(64009), sin_addr=inet_addr("72.32.146.52")}, [16], SOCK_NONBLOCK) = 17
     0.000052 epoll_ctl(27, EPOLL_CTL_ADD, 17, {EPOLLIN|EPOLLET, {u32=27130464, u64=27130464}}) = 0
     0.000028 epoll_wait(27, {{EPOLLIN, {u32=27130464, u64=27130464}}}, 512, 60000) = 1
     0.024942 recvfrom(17, "GET / HTTP/1.1\r\nHost: rackspace.com\r\n"..., 1024, 0, NULL, NULL) = 423
The preceding output shows part of an strace of Nginx. The line beginning with 3.620656 accept4 shows the address initiating the connection and the file descriptor (17) assigned to that connection. Later, in the line beginning with 0.024942 recvfrom, you see the same file descriptor and an HTTP GET request.
Although cryptic, strace is a very powerful debugging tool for when you want to really understand user-to-kernel space activity.
tcpdump/wireshark
Use tcpdump or wireshark to grab snapshots of network traffic.
Browser Development Tools
System Log [/var/log/messages, /var/log/syslog, and dmesg -c]
Tuning the operating system
When you perform tuning, the best practice is to make a change and then measure the result. It is important to remember that, most of the time, changes in one area affect another area. If a change does not make a difference in what you are analyzing, undo it. Keeping this in mind matters because modern kernels and software are already heavily optimized.
Configuring Tunable Options
/proc: When you use the /proc virtual filesystem to configure tunable options, the changes are not persistent. Using /proc lets you see the effects of a change before making it permanent and allows quick recovery should any issues arise from the change.
[root@ks4001893 ~]# cat /proc/sys/net/ipv4/tcp_syncookies
1
[root@ks4001893 ~]# echo 0 > /proc/sys/net/ipv4/tcp_syncookies
[root@ks4001893 ~]# cat /proc/sys/net/ipv4/tcp_syncookies
0
[root@ks4001893 ~]#
sysctl.conf: After you confirm that a change helps the system run efficiently, make it persistent by adding the setting to the /etc/sysctl.conf file. After you make the change, run sysctl -p to load the new settings immediately.
[root@ks4001893 ~]# cat /proc/sys/net/ipv4/tcp_syncookies
0
[root@ks4001893 ~]# vim /etc/sysctl.conf
[root@ks4001893 ~]# sysctl -p
net.ipv4.ip_forward = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
...
[root@ks4001893 ~]# cat /proc/sys/net/ipv4/tcp_syncookies
1
[root@ks4001893 ~]#
Basic Tuning Vectors
Backlog queue: Limits the number of pending connections; tune for a high rate of incoming connections
HTTP Connection: SYN/SYNACK/ACK
SYN/SYNACK [syn_backlog queue] or syncookie
ACK [listen backlog queue] Nginx: accept()
net.ipv4.tcp_max_syn_backlog
net.ipv4.tcp_syncookies
net.core.somaxconn
Nginx: listen ... backlog=1024;
net.core.netdev_max_backlog
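As a sketch, the kernel side of these backlog settings could be raised in /etc/sysctl.conf; the values below are illustrative starting points, not recommendations:

```
# /etc/sysctl.conf - illustrative backlog-related values
net.ipv4.tcp_max_syn_backlog = 4096   # pending SYN/SYNACK (half-open) connections
net.ipv4.tcp_syncookies = 1           # fall back to syncookies when the SYN queue fills
net.core.somaxconn = 1024             # cap on the listen (accepted) backlog queue
net.core.netdev_max_backlog = 4096    # packets queued per NIC before the kernel drops them
```

Note that the backlog Nginx requests with listen ... backlog=1024; is silently capped by net.core.somaxconn, so the two should be raised together.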
File descriptors: Limits number of active connections; tune for many simultaneous connections
fs.file-max
/etc/security/limits.conf
worker_rlimit_nofile 40960;
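Raising the file-descriptor limit touches three layers: the kernel, the user session, and Nginx itself. A sketch, with illustrative values and assuming Nginx runs as the nginx user:

```
# Kernel-wide cap, in /etc/sysctl.conf:
fs.file-max = 160000

# Per-user limits for the nginx user, in /etc/security/limits.conf:
nginx soft nofile 40960
nginx hard nofile 40960

# Nginx's own per-worker limit, in /etc/nginx/nginx.conf:
worker_rlimit_nofile 40960;
```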
Ephemeral ports: Limits number of upstream connections; tune when proxying, especially without keepalives
Each TCP connection requires a 4-tuple: [src_ip:src_port, dst_ip:dst_port]
net.ipv4.ip_local_port_range
net.ipv4.tcp_fin_timeout
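To see why ephemeral ports matter when proxying without keep-alives, here is a rough back-of-the-envelope estimate, assuming the common default port range of 32768-61000 and the default tcp_fin_timeout of 60 seconds:

```python
# Each proxied request without keep-alive consumes one ephemeral port in the
# connection's 4-tuple, and that port stays unavailable for roughly
# tcp_fin_timeout seconds after the connection closes.
port_range = 61000 - 32768 + 1   # usable ephemeral ports (common default range)
fin_timeout = 60                 # seconds before a closed port can be reused
sustainable_rate = port_range // fin_timeout
print(sustainable_rate)          # new connections/second to one upstream ip:port
```

Under those assumptions, a single upstream [ip:port] destination can sustain only about 470 new connections per second before ports run out, which is why widening ip_local_port_range or lowering tcp_fin_timeout helps.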
Tuning Nginx
Workers, Connections, Clients
[root@ks4001893 ~]# grep processor /proc/cpuinfo | wc -l
4
[root@ks4001893 ~]#
[root@ks4001893 ~]# vim /etc/nginx/nginx.conf
worker_processes 4;
worker_connections 1024;
Note: We recommend using the auto value for worker_processes, which causes Nginx to start one worker per CPU core automatically.
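As a rough upper bound, these two settings together determine how many simultaneous clients Nginx can serve; halve the figure when proxying, since each client also consumes a connection to the upstream:

```python
# worker_processes * worker_connections approximates the maximum number of
# simultaneous connections Nginx will accept with the configuration above.
worker_processes = 4        # one per CPU core, as detected above
worker_connections = 1024
max_clients = worker_processes * worker_connections
print(max_clients)          # upper bound on simultaneous client connections
```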
Buffers
client_body_buffer_size 10k;
client_header_buffer_size 1k;
client_max_body_size 8m;
large_client_header_buffers 2 1k;
client_body_buffer_size: Sets the buffer size for reading the client request body. If the request body is larger than the buffer, the whole body, or only a part of it, is written to a temporary file.
client_header_buffer_size: Sets the buffer size for reading the client request header. For most requests, a buffer of 1 kilobyte is enough. However, if a request includes long cookies, or comes from a WAP client, it may not fit into 1 kilobyte. If a request line or a request header field does not fit into this buffer, then larger buffers, configured by the large_client_header_buffers directive, are allocated.
client_max_body_size: Sets the maximum allowed size of the client request body, as specified in the Content-Length request header field.
large_client_header_buffers: Sets the maximum number and size of buffers used for reading large client request headers. A request line cannot exceed the size of one buffer, or the 414 (Request-URI Too Large) error is returned to the client. Buffers are allocated only on demand. By default, the buffer size is equal to 8 kilobytes. If, after the end of request processing, a connection is transitioned into the keep-alive state, these buffers are released.
Timeouts
client_body_timeout 12;
client_header_timeout 12;
keepalive_timeout 15;
send_timeout 10;
client_body_timeout: Defines a timeout for reading the client request body. The timeout is set only for a period between two successive read operations, not for the transmission of the whole request body.
client_header_timeout: Defines a timeout for reading the client request header.
keepalive_timeout: The first parameter sets a timeout during which a keep-alive client connection stays open on the server side. A value of 0 disables keep-alive client connections. The optional second parameter sets the value in the Keep-Alive: timeout=time response header field.
send_timeout: Sets a timeout for transmitting a response to the client. The timeout is set only between two successive write operations, not for the transmission of the whole response.
Gzip and Expires Header
Enabling gzip reduces the amount of data transferred over the network, which improves speed. The expires header helps avoid unnecessary requests for static content by caching it for a set amount of time. The expires header can be set inside the http, server, or location blocks.
gzip on;
gzip_min_length 1100;
gzip_buffers 4 32k;
gzip_types text/plain application/x-javascript text/xml text/css;

location ~* \.(jpg|jpeg|png|gif|ico|css|js)$ {
    expires 365d;
}
Open File Cache
The open_file_cache directive enables you to cache open file descriptors, frequently accessed files, and file information such as size and modification time.
open_file_cache max=1000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
Miscellaneous
accept_mutex off;
accept_mutex is a configurable parameter that helps distribute incoming traffic evenly across the worker processes, but it carries a small amount of CPU overhead. Disabling it on busy services may result in a speed increase.
When proxying, use keep-alives. By default, when Nginx proxies traffic to an upstream server, it does not use HTTP keep-alive connections but rather one-shot TCP connections, which is inefficient both in its use of ephemeral ports and in the latency of establishing a new TCP connection with the upstream server for each request.
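A minimal sketch of enabling upstream keep-alives; the upstream name and backend address here are hypothetical:

```
upstream backend {
    server 10.0.0.10:8080;   # hypothetical backend
    keepalive 32;            # idle keep-alive connections cached per worker
}

server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;          # keep-alive requires HTTP/1.1
        proxy_set_header Connection "";  # clear the Connection header from the client
    }
}
```

The keepalive directive alone is not enough: without proxy_http_version 1.1 and an empty Connection header, Nginx still closes each upstream connection after one request.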
Additional Resources
https://community.rackspace.com/products/f/25/t/7377
http://www.linuxmanpages.com/man8/sysctl.8.php
https://www.kernel.org/doc/Documentation/kernel-parameters.txt
http://cr.yp.to/syncookies.html
http://nginx.org/en/docs/
http://wiki.nginx.org/Configuration
https://deviceatlas.com/blog/image-optimization-using-deviceatlas-nginx-module
https://gist.github.com/denji/8359866