最近在优化大批量数据插入的性能问题。
项目原来使用的大批量数据插入方法是Mybatis的foreach拼接SQL的方法。
我发现不管改成Mybatis Batch提交或者原生JDBC Batch的方法都不起作用,实际上在插入的时候仍然是一条条记录的插,速度远不如原来Mybatis的foreach拼接SQL的方法。这对于常理来说是非常不科学的。
下面先罗列一下三种插入方式:
public class NotifyRecordDaoTest extends BaseTest {
@Resource(name = "masterDataSource")
private DataSource dataSource;
@Test
public void insert() throws Exception {
Connection connection = dataSource.getConnection();
connection.setAutoCommit(false);
String sql = "insert into notify_record(" +
" partner_no," +
" trade_no, loan_no, notify_times," +
" limit_notify_times, notify_url, notify_type,notify_content," +
" notify_status)" +
" values(?,?,?,?,?,?,?,?,?) ";
PreparedStatement statement = connection.prepareStatement(sql);
for (int i = 0; i < 10000; i++) {
statement.setString(1, "1");
statement.setString(2, i + "");
statement.setInt(3, 1);
statement.setInt(4, 1);
statement.setString(5, "1");
statement.setString(6, "1");
statement.setString(7, "1");
statement.setString(8, "1");
statement.setString(9, "1");
statement.addBatch();
}
long start = System.currentTimeMillis();
statement.executeBatch();
connection.commit();
connection.close();
statement.close();
System.out.println(System.currentTimeMillis() - start);
}
@Test
public void insertB() {
List<NotifyRecordEntity> notifyRecordEntityList = Lists.newArrayList();
for (int i = 0; i < 10000; i++) {
NotifyRecordEntity record = new NotifyRecordEntity();
record.setLastNotifyTime(new Date());
record.setPartnerNo("1");
record.setLimitNotifyTimes(1);
record.setNotifyUrl("1");
record.setLoanNo("1");
record.setNotifyContent("1");
record.setTradeNo("" + i);
record.setNotifyTimes(1);
record.setNotifyType(EnumNotifyType.DAIFU);
record.setNotifyStatus(EnumNotifyStatus.FAIL);
notifyRecordEntityList.add(record);
}
long start = System.currentTimeMillis();
Map<String, Object> params = Maps.newHashMap();
params.put("notifyRecordEntityList", notifyRecordEntityList);
DaoFactory.notifyRecordDao.insertSelectiveList(params);
System.out.println(System.currentTimeMillis() - start);
}
@Resource
SqlSessionFactory sqlSessionFactory;
@Test
public void insertC() {
SqlSession sqlsession = sqlSessionFactory.openSession(ExecutorType.BATCH, false);
NotifyRecordDao notifyRecordDao = sqlsession.getMapper(NotifyRecordDao.class);
int num = 0;
for (int i = 0; i < 10000; i++) {
NotifyRecordEntity record = new NotifyRecordEntity();
record.setLastNotifyTime(new Date());
record.setPartnerNo("1");
record.setLimitNotifyTimes(1);
record.setNotifyUrl("1");
record.setLoanNo("1");
record.setNotifyContent("1");
record.setTradeNo("s" + i);
record.setNotifyTimes(1);
record.setNotifyType(EnumNotifyType.DAIFU);
record.setNotifyStatus(EnumNotifyStatus.FAIL);
notifyRecordDao.insert(record);
num++;
// if(num>=1000){
// sqlsession.commit();
// sqlsession.clearCache();
// num=0;
// }
}
long start = System.currentTimeMillis();
sqlsession.commit();
sqlsession.clearCache();
sqlsession.close();
System.out.println(System.currentTimeMillis() - start);
}
}
测试插入一万条数据的发现除了拼接SQL的方式需要用5秒多的时间外,Mybatis Batch和原生JDBC Batch都需要50多秒,怎么想都觉得不可能,写法没有问题一定是数据库或者数据库连接配置上有问题。
后来才发现要批量执行的话,JDBC连接URL字符串中需要新增一个参数:rewriteBatchedStatements=true
master.jdbc.url=jdbc:mysql://112.126.84.3:3306/outreach_platform?useUnicode=true&characterEncoding=utf8&allowMultiQueries=true&rewriteBatchedStatements=true
关于rewriteBatchedStatements这个参数介绍:
MySQL的JDBC连接的url中要加rewriteBatchedStatements参数,并保证5.1.13以上版本的驱动,才能实现高性能的批量插入。
MySQL JDBC驱动在默认情况下会无视executeBatch()语句,把我们期望批量执行的一组sql语句拆散,一条一条地发给MySQL数据库,批量插入实际上是单条插入,直接造成较低的性能。
只有把rewriteBatchedStatements参数置为true, 驱动才会帮你批量执行SQL
另外这个选项对INSERT/UPDATE/DELETE都有效
添加rewriteBatchedStatements=true这个参数后的执行速度比较:
同个表插入一万条数据时间近似值:
JDBC BATCH 1.1秒左右 > Mybatis BATCH 2.2秒左右 > 拼接SQL 4.5秒左右