MySQL Timeouts Explained

[TOC]

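1. How many timeout variables are there?

Open mysql and run show variables like '%timeout%', and the result is a bit of a shock: as shown below, there are a dozen timeout-related variables. Clearly I did not know MySQL as well as I thought. So what does each of these timeouts actually mean? I spent an afternoon reading and running a few small experiments, and I now understand at least some of them. If anything here is wrong, corrections are welcome.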

mysql> show variables like '%timeout%';
+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| connect_timeout             | 10       |
| delayed_insert_timeout      | 300      |
| innodb_flush_log_at_timeout | 1        |
| innodb_lock_wait_timeout    | 50       |
| innodb_rollback_on_timeout  | OFF      |
| interactive_timeout         | 28800    |
| lock_wait_timeout           | 31536000 |
| net_read_timeout            | 30       |
| net_write_timeout           | 60       |
| rpl_stop_slave_timeout      | 31536000 |
| slave_net_timeout           | 3600     |
| wait_timeout                | 28800    |
+-----------------------------+----------+

2. Analysis

Below, the more commonly used timeouts are picked out and analyzed one by one.

2.1 connect_timeout

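connect_timeout is the timeout for the handshake during connection setup. It defaults to 10 seconds as of 5.0.52 and to 5 seconds in earlier versions. The official documentation says: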

connect_timeout: The number of seconds that the mysqld server waits for a connect packet before responding with Bad handshake. The default value is 10 seconds as of MySQL 5.0.52 and 5 seconds before that

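MySQL's basic design is presumably a listener thread that loops accepting requests; when a request arrives, it creates a thread (or takes one from a thread pool) to handle it. Because MySQL connections run over TCP, the TCP three-way handshake has to happen first. Once it succeeds, the client blocks, waiting for a message from the server. The server then creates a thread (or takes one from the thread pool) to handle the request; the main checks are host verification and username/password verification. Host verification is familiar, because a host is specified when privileges are granted to a user with the grant command. For password authentication, the server first generates a random number and sends it to the client; the client applies several rounds of SHA1 to that random number and the password and sends the result back for the server to verify. If verification passes, the connection handshake is complete.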
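The challenge-response step described above can be sketched in a few lines. This is a minimal illustration of the scheme used by the mysql_native_password plugin (newer MySQL versions may default to a different plugin); the function names are made up for this example, and the 20-byte scramble is supplied by the caller rather than read from a real handshake packet.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def native_password_token(password: str, scramble: bytes) -> bytes:
    """Client side: SHA1(password) XOR SHA1(scramble + SHA1(SHA1(password)))."""
    stage1 = sha1(password.encode())
    stage2 = sha1(stage1)                      # what the server stores
    mask = sha1(scramble + stage2)
    return bytes(a ^ b for a, b in zip(stage1, mask))


def server_verify(token: bytes, scramble: bytes, stored_stage2: bytes) -> bool:
    """Server side: undo the XOR, hash once more, compare with the stored hash."""
    mask = sha1(scramble + stored_stage2)
    stage1 = bytes(a ^ b for a, b in zip(token, mask))
    return sha1(stage1) == stored_stage2
```

The password itself never crosses the wire, and because the scramble changes on every connection, a captured token cannot simply be replayed.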

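On the client side, a connect timeout is a TCP connection timeout, and it comes in two kinds: either the connection timeout configured by the client elapsed, or the TCP-level SYN packet reached its retry limit. The two produce different error messages.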
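As a rough illustration of how a client can tell the two cases apart, here is a sketch using Python's standard socket module rather than a MySQL driver. The function name and the returned strings are invented for this example; real drivers wrap these errors in their own exception types and messages.

```python
import errno
import socket


def classify_connect(host: str, port: int, timeout: float) -> str:
    """Try a TCP connect and report which kind of failure occurred, if any."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        if exc.errno == errno.ETIMEDOUT:
            # The kernel gave up after retransmitting the SYN too many times.
            return "tcp timeout: SYN retries exhausted"
        if isinstance(exc, socket.timeout):
            # Our own timeout fired first; this error carries no errno.
            return "client timeout: configured limit elapsed"
        return f"other error: {exc}"
    sock.close()
    return "connected"
```

Which case is hit in practice depends on whether the configured timeout is shorter than the time the operating system spends retrying the SYN.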

2.2 interactive_timeout & wait_timeout

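Again, start with the official documentation. According to it, wait_timeout and interactive_timeout both specify how long an inactive connection may stay idle. When a connection thread starts, the session wait_timeout is set to one of these two values depending on whether the client is interactive. If we log in with mysql -uroot -p, wait_timeout is set to the value of interactive_timeout. If we then do nothing for wait_timeout seconds, the next command reports a timeout, because the server has already closed the connection.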

The number of seconds the server waits for activity on a noninteractive connection before closing it.

On thread startup, the session wait_timeout value is initialized from the global wait_timeout value or from the global interactive_timeout value, depending on the type of client (as defined by the CLIENT_INTERACTIVE connect option to mysql_real_connect()).
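The rule in the documentation excerpt above boils down to a single branch. The sketch below is only a model of that rule, not server code: the function name is made up, and 0x0400 is the value of the CLIENT_INTERACTIVE capability flag in the client/server protocol.

```python
CLIENT_INTERACTIVE = 0x0400  # capability flag sent by interactive clients


def initial_session_wait_timeout(client_flags: int,
                                 global_wait_timeout: int,
                                 global_interactive_timeout: int) -> int:
    """Pick the value the session wait_timeout starts with."""
    if client_flags & CLIENT_INTERACTIVE:
        return global_interactive_timeout
    return global_wait_timeout
```

This is why changing the global interactive_timeout affects the mysql command-line client, while applications that connect without the flag follow the global wait_timeout.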

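The test goes as follows:

mysql> set global interactive_timeout=3; ## set the interactive timeout to 3 seconds
// setting it at the session level does not seem to take effect

Log in to mysql again, and now we see: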

mysql> show variables like '%timeout%'; ## wait_timeout is now 3 seconds
+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| connect_timeout             | 10       |
| delayed_insert_timeout      | 300      |
| innodb_flush_log_at_timeout | 1        |
| innodb_lock_wait_timeout    | 50       |
| innodb_rollback_on_timeout  | OFF      |
| interactive_timeout         | 3        |
| lock_wait_timeout           | 31536000 |
| net_read_timeout            | 30       |
| net_write_timeout           | 3        |
| rpl_stop_slave_timeout      | 31536000 |
| slave_net_timeout           | 3600     |
| wait_timeout                | 3        |
+-----------------------------+----------+

wait_timeout has been set to the value of interactive_timeout. If we now wait more than 3 seconds before running another command, we get the following:

mysql> show variables like '%timeout%';
ERROR 2006 (HY000): MySQL server has gone away  ## timed out, client reconnects
No connection. Trying to reconnect...
Connection id:    50
Current database: *** NONE ***

+-----------------------------+----------+
| Variable_name               | Value    |
+-----------------------------+----------+
| connect_timeout             | 10       |
| delayed_insert_timeout      | 300      |
| innodb_flush_log_at_timeout | 1        |
| innodb_lock_wait_timeout    | 50       |
| innodb_rollback_on_timeout  | OFF      |
| interactive_timeout         | 3        |
| lock_wait_timeout           | 31536000 |
| net_read_timeout            | 30       |
| net_write_timeout           | 3        |
| rpl_stop_slave_timeout      | 31536000 |
| slave_net_timeout           | 3600     |
| wait_timeout                | 3        |
+-----------------------------+----------+
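The mysql client above reconnects on its own; application code usually has to do this itself. The following is a minimal sketch of that pattern under stated assumptions: ServerGoneAway and IdleKilledConnection are hypothetical stand-ins for a real driver's error and connection classes, and connect is any callable that returns a fresh connection.

```python
class ServerGoneAway(Exception):
    """Stand-in for the driver error behind ERROR 2006 (hypothetical name)."""


class IdleKilledConnection:
    """Fake connection: the first instance acts as if the server closed it."""
    created = 0

    def __init__(self):
        type(self).created += 1
        self.dead = type(self).created == 1

    def execute(self, sql: str) -> str:
        if self.dead:
            raise ServerGoneAway("MySQL server has gone away")
        return f"ok: {sql}"


def run_with_reconnect(connect, sql: str, max_reconnects: int = 1):
    """Run sql, opening a new connection if the old one was closed while idle."""
    conn = connect()
    for attempt in range(max_reconnects + 1):
        try:
            return conn.execute(sql)
        except ServerGoneAway:
            if attempt == max_reconnects:
                raise
            conn = connect()  # session state of the old connection is lost
```

Note the "Current database: *** NONE ***" line in the output above: a reconnect yields a brand-new session, so blindly retrying is only safe for statements that do not depend on earlier session or transaction state.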

2.3 innodb_lock_wait_timeout & innodb_rollback_on_timeout

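Again, start with the official documentation. According to it, this value is specific to the InnoDB engine: it is how long InnoDB waits for a row lock, 50 seconds by default. On timeout, the current statement is rolled back. If innodb_rollback_on_timeout is enabled, the whole transaction is rolled back; otherwise only the statement that was waiting for the row lock is rolled back.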

The length of time in seconds an InnoDB transaction waits for a row lock before giving up. The default value is 50 seconds. A transaction that tries to access a row that is locked by another InnoDB transaction waits at most this many seconds for write access to the row before issuing the following error:

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Let's test this as well (first create an InnoDB table testLock with a single column named a):

mysql> CREATE TABLE `testLock` ( `a` int primary key) engine=innodb;

First insert three rows of test data:

mysql> select * from testLock;
+---+
| a |
+---+
| 1 |
| 2 |
| 3 |
+---+

With innodb_rollback_on_timeout=OFF and innodb_lock_wait_timeout set to 1, we open two transactions:

## Transaction 1: take the row lock
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from testLock where a=2 for update;
+---+
| a |
+---+
| 2 |
+---+
1 row in set (0.01 sec)

## Transaction 2: request the row lock
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> delete from testLock where a=1;
Query OK, 1 row affected (0.00 sec)

mysql> delete from testLock where a=2; ## row lock request times out
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
mysql> select * from testLock; # still inside the transaction, not yet committed
+---+
| a |
+---+
| 2 |
| 3 |
+---+
2 rows in set (0.00 sec)

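mysql> begin; ## Starting another transaction here (or committing the current one) rolls back only the second statement of the original transaction, so testLock ends up with just 2 and 3. An explicit rollback here would instead roll back the whole transaction and leave 1, 2, 3 intact.

With innodb_rollback_on_timeout=ON, transaction 2 times out in the same way, but if we then begin a new transaction, the entire transaction that timed out waiting for the lock is rolled back, not just the timed-out statement as before.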
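The difference between the two settings can be captured in a toy model. This is not how InnoDB is implemented; it only replays the experiment above. The function name and the (row, timed_out) tuple format are invented for the example, and each statement is assumed to delete a single row.

```python
def rows_after_commit(initial_rows, statements, rollback_on_timeout: bool):
    """Return the rows left after committing a transaction of DELETE statements.

    statements is a list of (row_to_delete, timed_out) pairs in execution order.
    """
    pending_deletes = set()
    for row, timed_out in statements:
        if timed_out:
            if rollback_on_timeout:
                pending_deletes.clear()  # ON: the whole transaction is undone
            continue                     # the timed-out statement never applies
        pending_deletes.add(row)
    return sorted(set(initial_rows) - pending_deletes)
```

With the statements from the test, delete a=1 succeeding and delete a=2 timing out, the model returns [2, 3] when the option is OFF and [1, 2, 3] when it is ON.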

2.4 lock_wait_timeout

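The documentation describes it as follows. In short, lock_wait_timeout is the timeout for waiting on a metadata lock; any statement that takes a metadata lock uses this timeout, which defaults to one year. For background, see the MySQL documentation on metadata locking. To keep transactions serializable, for both MyISAM and InnoDB tables, once a session starts a transaction it acquires a metadata lock on each table it touches; if another session then tries to modify that table's metadata, it blocks until the timeout.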

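Test example:
We use a MyISAM table, myisam_test, which holds a single row. First open one session, start a transaction, and run a select. Then open a second session and run a metadata operation on the table, such as dropping it; the operation blocks and reports a timeout after lock_wait_timeout seconds.

## Session 1: acquire the metadata lock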
mysql> show create table myisam_test;
+-------------+------------------------------------------+
| Table       | Create Table                             |
+-------------+------------------------------------------+
| myisam_test | CREATE TABLE `myisam_test` (
  `i` int(11) NOT NULL,
  `j` int(11) DEFAULT NULL,
  PRIMARY KEY (`i`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------------+------------------------------------------+

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from myisam_test;
+---+------+
| i | j    |
+---+------+
| 2 |    1 |
+---+------+
1 row in set (0.00 sec)

## Session 2: dropping the table times out
mysql> drop table myisam_test;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
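The blocking behavior in this test can be imitated with an ordinary mutex. This is a deliberately crude analogy, not a model of MySQL internals: a single threading.Lock stands in for the metadata lock (real metadata locks have shared and exclusive modes), and try_ddl is a made-up name.

```python
import threading


def try_ddl(metadata_lock: threading.Lock, lock_wait_timeout: float) -> str:
    """Wait up to lock_wait_timeout seconds for the lock, then give up."""
    if metadata_lock.acquire(timeout=lock_wait_timeout):
        try:
            return "Query OK"
        finally:
            metadata_lock.release()
    return "ERROR 1205: Lock wait timeout exceeded"
```

While the first session's open transaction holds the lock, try_ddl waits for the full timeout and then fails; once the transaction ends and the lock is released, the same call succeeds immediately.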

The metadata statements that change table structure include the following:

DROP TABLE t;
ALTER TABLE t ...;
DROP TABLE nt;
ALTER TABLE nt ...;
LOCK TABLE t ... WRITE;

2.5 net_read_timeout & net_write_timeout

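The documentation describes them as follows: these two parameters come into play when network conditions are poor. For example, suppose a client is importing a very large file with load data infile and, partway through, iptables is used to block MySQL's port 3306. The server-side connection is then in the reading from net state, and the server closes it after waiting net_read_timeout seconds. Likewise, if a program is querying a very large table and the port is blocked mid-query to simulate a network outage, the connection is in the writing to net state and is closed after net_write_timeout seconds. slave_net_timeout works similarly.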

The number of seconds to wait for more data from a connection before aborting the read. When the server is reading from the client, net_read_timeout is the timeout value controlling when to abort. When the server is writing to the client, net_write_timeout is the timeout value controlling when to abort

Test:
I created a 120 MB data file, data.txt, and then logged in to mysql.

mysql -uroot -h 127.0.0.1 -P 3306 --local-infile=1

During the import, block port 3306 with iptables.

iptables -A INPUT -p tcp --dport 3306 -j DROP
iptables -A OUTPUT -p tcp --sport 3306 -j DROP

The connection state shows reading from net, and the connection is closed after net_read_timeout seconds.
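The same idea can be reproduced without MySQL or iptables. The sketch below uses a plain socket pair to play the role of the server reading a payload whose sender stalls partway through; read_exactly and its return strings are invented for this example, and the socket timeout only approximates what net_read_timeout does inside the server.

```python
import socket


def read_exactly(conn: socket.socket, expected: int, net_read_timeout: float) -> str:
    """Read `expected` bytes, aborting if no data arrives within the timeout."""
    conn.settimeout(net_read_timeout)
    received = b""
    while len(received) < expected:
        try:
            chunk = conn.recv(4096)
        except socket.timeout:
            # No more data arrived in time: give up on this read.
            return f"aborted after {len(received)} bytes"
        if not chunk:
            return f"peer closed after {len(received)} bytes"
        received += chunk
    return f"read all {len(received)} bytes"
```

The timer restarts on every successful recv, so it bounds the gap between packets rather than the total transfer time; a slow but steady upload is not cut off.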

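3. Summary

These experiments show that connect_timeout applies during the handshake and authentication phase (authenticate), interactive_timeout and wait_timeout apply while the connection is idle (sleep), and net_read_timeout and net_write_timeout apply while the connection is busy (query) or when the network has problems.

net_read_timeout and net_write_timeout apply only to TCP/IP connections. They are, respectively, how long the database waits to receive a network packet from the client and how long it waits to send one to the client, and they only take effect for threads in an active state.

These two parameters govern abnormal timeouts caused by the network. For example, the server is reading a large amount of data from the client when the data suddenly stops arriving without any end marker; if no further data arrives within net_read_timeout seconds, the server drops the connection. Or the server has selected a large result set and is sending it to the client when the client stops receiving before all of the data has been sent; after waiting net_write_timeout seconds, the server drops the connection.

These timeouts belong to MySQL's application-layer protocol and should not be confused with TCP's write timeout, which only fires after too many packets have been lost.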