含义
To understand the backlog argument, we must realize that for a given listening socket, the kernel maintains two queues :
要明白backlog参数的含义,我们必须明白对于一个listening socket,kernel维护者两个队列:
1.An incomplete connection queue, which contains an entry for each SYN that has arrived from a client for which the server is awaiting completion of the TCP three-way handshake. These sockets are in the SYN_RCVD state .
1.一个未完成连接的队列,此队列维护着那些已收到了客户端SYN分节信息,等待完成三路握手的连接,socket的状态是SYN_RCVD
2.A completed connection queue, which contains an entry for each client with whom the TCP three-way handshake has completed. These sockets are in the ESTABLISHED state
2.一个已完成的连接的队列,此队列包含了那些已经完成三路握手的连接,socket的状态是ESTABLISHED
The backlog argument to the listen function has historically specified the maximum value for the sum of both queues.backlog参数历史上被定义为上面两个队列的大小之和
Berkeley-derived implementations add a fudge factor to the backlog: It is multiplied by 1.5
Berkely实现中的backlog值为上面两队列之和再乘以1.5
When a SYN arrives from a client, TCP creates a new entry on the incomplete queue and then responds with the second segment of the three-way handshake: the server's SYN with an ACK of the client's SYN (Section 2.6). This entry will remain on the incomplete queue until the third segment of the three-way handshake arrives (the client's ACK of the server's SYN), or until the entry times out. (Berkeley-derived implementations have a timeout of 75 seconds for these incomplete entries.)
当客户端的第一个SYN到达的时候,TCP会在未完成队列中增加一个新的记录然后回复给客户端三路握手中的第二个分节(服务端的SYN和针对客户端的ACK),这条记录会在未完成队列中一直存在,直到三路握手中的最后一个分节到达,或者直到超时(Berkeley时间将这个超时定义为75秒)
If the queues are full when a client SYN arrives, TCP ignores the arriving SYN (pp. 930–931 of TCPv2); it does not send an RST. This is because the condition is considered temporary, and the client TCP will retransmit its SYN, hopefully finding room on the queue in the near future. If the server TCP immediately responded with an RST, the client's connect would return an error, forcing the application to handle this condition instead of letting TCP's normal retransmission take over. Also, the client could not differentiate between an RST in response to a SYN meaning "there is no server at this port" versus "there is a server at this port but its queues are full."
如果当客户端SYN到达的时候队列已满,TCP将会忽略后续到达的SYN,但是不会给客户端发送RST信息,因为此时允许客户端重传SYN分节,如果返回错误信息,那么客户端将无法分清到底是服务端对应端口上没有相应应用程序还是服务端对应端口上队列已满这两种情况
流程
在linux 2.2以前,backlog大小包括了半连接状态和全连接状态两种队列大小。linux 2.2以后,分离为两个backlog来分别限制半连接SYN_RCVD状态的未完成连接队列大小跟全连接ESTABLISHED状态的已完成连接队列大小。互联网上常见的TCP SYN FLOOD恶意DOS攻击方式就是用/proc/sys/net/ipv4/tcp_max_syn_backlog来控制的,可参见《TCP洪水攻击(SYN Flood)的诊断和处理》。
在使用listen函数时,内核会根据传入参数的backlog跟系统配置参数/proc/sys/net/core/somaxconn中,二者取最小值,作为“ESTABLISHED状态之后,完成TCP连接,等待服务程序ACCEPT”的队列大小。在kernel 2.4.25之前,是写死在代码常量SOMAXCONN,默认值是128。在kernel 2.4.25之后,在配置文件/proc/sys/net/core/somaxconn (即 /etc/sysctl.conf 之类 )中可以修改。流程图,如下: