在用python的MySQLdb库操作mysql数据库时,使用executemany可以批量插入数据,极大的提高写入速度。今天我在使用executemany的使用,碰到了两个坑,做一下记录:
1. executemany和on duplicate key update 合用时:
表结构为:
| test | CREATE TABLE `test` (
`id` varchar(20) NOT NULL,
`max_score` int(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='max_score' |
正确的写法是:
params = [("ua", 60), ("ub", 70), ("ua", 80)]
cur.executemany("""\
insert into test (id, max_score)
values (%s, %s) on duplicate key update
id = if(max_score>values(max_score), id, values(id)),
max_score = if(max_score>values(max_score), max_score, values(max_score))
""", params)
错误的写法是:
params = [(item[0], item[1], item[1], item[0], item[1], item[1]) for item in params]
cur.executemany("""\
insert into test (id, max_score)
values (%s, %s) on duplicate key update
id = if(max_score>%s, id, %s),
max_score = if(max_score>%s, max_score, %s)
""", params)
报错:
Traceback (most recent call last):
File "test.py", line 39, in <module>
""", params)
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 218, in executemany
self.errorhandler(self, TypeError, msg)
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
TypeError: not all arguments converted during string formatting
这个问题只有在executemany时会有。如果逐条插入的话,就没有这个问题。
2. max_allowed_packet
在进行批量写入的时候,发现报错:
Traceback (most recent call last):
File "test
.py", line 38, in <module>
""", params)
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 223, in executemany
r = self._query('\n'.join([query[:p], ',\n'.join(q), query[e:]]))
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 316, in _query
rowcount = self._do_query(q)
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 280, in _do_query
db.query(q)
_mysql_exceptions.OperationalError: (2006, 'MySQL server has gone away')
原因是因为mysql设置了写入的最大数据大小。可以这样查询:
mysql> show variables like "max_allowed_packet";
+--------------------+----------+
| Variable_name | Value |
+--------------------+----------+
| max_allowed_packet | 33554432 |
+--------------------+----------+
1 row in set (0.00 sec)
可以看到设置了大约33M的最大允许的写入数据大小。而我写入的数据大概在100M左右,所以mysql服务器断开了连接。可以修改该值或者分批写入。