iihero
2022-07-02
Java, spring-boot, spring cloud, PostgreSQL, reWriteBatchedInserts, batchInsert
在使用spring-boot JPA 结合 Hibernate/Eclipse Link的过程中, 你会发现一个有趣的现象,就是使用的batchInsert并不是真正的batch. 详情可以看看: https://jdbc.postgresql.org/documentation/head/connect.html https://vladmihalcea.com/postgresql-multi-row-insert-rewritebatchedinserts-property/
而在使用spring cloud生成data source的过程中,你会痛苦的发现,试了好多种方式,都不能有效的将这个property给设上。目前spring-boot 2.7里头默认的connection pool用的是HiKari,替代以前的tomcat 的DBCP. 性能确实是没话说。可是在设置property方面,确实有些迷惑人。
直接使用spring cloud 中的PoolConfig etc.
PoolConfig poolConfig = new PoolConfig(minPoolSize, maxParallelRequests, 5000);
Map<String, Object> connProperties = new HashMap<>();
connProperties.put("connectionTimeout", 5000);
connProperties.put("reWriteBatchedInserts", "true");
ConnectionConfig connConfig = new ConnectionConfig(
"reWriteBatchedInserts=true&options=-c%20statement_timeout=2000&ApplicationName=iihero");
DataSourceConfig dbConfig = new DataSourceConfig(poolConfig, connConfig, null, connProperties);
DataSource dataSource = connectionFactory().dataSource(dbConfig);
发现上边connConfig中传的参数完全被HiKari忽略。
在一的基础上,尝试修改HiKari中的Datasource properties。
PoolConfig poolConfig = new PoolConfig(minPoolSize, maxParallelRequests, 5000);
Map<String, Object> connProperties = new HashMap<>();
connProperties.put("connectionTimeout", 5000);
connProperties.put("reWriteBatchedInserts", "true");
ConnectionConfig connConfig = new ConnectionConfig(
"reWriteBatchedInserts=true&options=-c%20statement_timeout=2000&ApplicationName=iihero");
DataSourceConfig dbConfig = new DataSourceConfig(poolConfig, connConfig, null, connProperties);
DataSource dataSource = connectionFactory().dataSource(dbConfig);
DelegatingDataSource myDs = (DelegatingDataSource) dataSource;
HikariDataSource ds = (HikariDataSource) myDs.getTargetDataSource();
ds.getDataSourcePropeties().put("reWriteBatchedInserts", "true");
依然无用。
直接在application.yml文件中定义附加属性
spring:
datasource:
hikari:
data-source-properties:
rewriteBatchedStatements: true
ApplicationName: iihero
还是不能用。提示,其实如果是独立的非cloud的应用,这种方式还是可以的。把完整的datasource定义在上边定义出来即可。
@Bean
@Primary
public DataSourceProperties dataSourceProperties(Cloud cloud) {
DataSourceProperties properties = new DataSourceProperties();
if (cloud != null) {
List<ServiceInfo> infos = cloud.getServiceInfos(DataSource.class);
if (infos.size() >= 1) {
RelationalServiceInfo info = (RelationalServiceInfo) infos.get(0);
properties.setUrl(info.getJdbcUrl()
+ "&reWriteBatchedInserts=true&options=-c%20statement_timeout=2000&ApplicationName=iihero");
properties.setUsername(info.getUserName());
properties.setPassword(info.getPassword());
}
}
log.warn("Bean created... {}", properties.getUrl());
return properties;
}
@Bean
@ConditionalOnBean(DataSourceProperties.class)
public DataSource dataSource(DataSourceProperties props) {
HikariConfig config = new HikariConfig();
config.setConnectionTimeout(5000L);
config.setMaximumPoolSize(maxParallelRequests);
config.setJdbcUrl(props.getUrl());
config.setUsername(props.getUsername());
config.setPassword(props.getPassword());
HikariDataSource dataSource = new HikariDataSource(config);
Properties p = dataSource.getDataSourceProperties();
for (Object key : p.keySet()) {
System.err.println(String.format("Hikari datasource Key: " + key + "value: " + p.get(key)));
}
return dataSource;
}
最后发现这种方式是能工作的。
当然,上边也是可以略微调整的。在HiKariConfig的设置当中加上下边的code。true这里用的是String, 不是Boolean。
Properties dsProperties = new Properties();
dsProperties.put("reWriteBatchedInserts", "true");
config.setDataSourceProperties(dsProperties);
直接连接上目标数据库PostgreSQL. 通过查询pg_stat_activity视图,一目了然
select * from pg_stat_activity where application_name = 'iihero' or application_name like 'DBeaver%'
结果如:
datname | pid | usesysid | usename | datid | application_name | client_addr |
---|---|---|---|---|---|---|
63353 | c67c0f94ec69 | 16396 | imBCySiKYOka | 14411 | iihero | |
63353 | c67c0f94ec69 | 16396 | imBCySiKYOka | 14406 | iihero | |
63353 | c67c0f94ec69 | 16396 | imBCySiKYOka | 14400 | iihero | |
63353 | c67c0f94ec69 | 16396 | imBCySiKYOka | 14398 | iihero | |
63353 | c67c0f94ec69 | 16396 | imBCySiKYOka | 14397 | iihero |
其实,如果不指定ApplicationName, 默认的值应该是:
select * from pg_stat_activity -- where application_name = 'iihero' or application_name like 'DBeaver%'
结果如:
datid | datname | pid | usesysid | usename | application_name |
---|---|---|---|---|---|
16396 | imBCySiKYOka | 17117 | 22785 | d2140fe02dcb | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 3055 | 63335 | f8d4ba83dfea | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 31084 | 63347 | 7b985656a6fc | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 29489 | 63335 | f8d4ba83dfea | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 27090 | 63335 | f8d4ba83dfea | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 29159 | 63346 | 77b28d132e14 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 32562 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 741 | 63335 | f8d4ba83dfea | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 13393 | 63358 | f41e5f1aef8e | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 16397 | 22785 | d2140fe02dcb | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 22048 | 63943 | 6cc01164ffd7 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 3734 | 63347 | 7b985656a6fc | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 21308 | 63340 | ae13e398433e | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 27439 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 24047 | 22785 | d2140fe02dcb | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 29849 | 63943 | 6cc01164ffd7 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 29850 | 63346 | 77b28d132e14 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 22961 | 63357 | a45a63c1d363 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 21725 | 22785 | d2140fe02dcb | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 22039 | 63340 | ae13e398433e | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 24780 | 63943 | 6cc01164ffd7 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 22414 | 22785 | d2140fe02dcb | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 3192 | 22785 | d2140fe02dcb | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 20968 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 14376 | 63353 | c67c0f94ec69 | iihero |
16384 | rdsadmin | 18301 | 10 | rdsadmin | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 1879 | 63943 | 6cc01164ffd7 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 31779 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 13621 | 63347 | 7b985656a6fc | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 24003 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 14381 | 63353 | c67c0f94ec69 | iihero |
16396 | imBCySiKYOka | 1939 | 63942 | 2cd9584f4103 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 31845 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 3163 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 24236 | 63348 | f8aa3288a874 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 31876 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 31824 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 29516 | 63943 | 6cc01164ffd7 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 28031 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 24974 | 51029 | 4aa3bb7753c7 | DBeaver 21.0.5 - SQLEditor <Script-16.sql> |
16396 | imBCySiKYOka | 14382 | 63353 | c67c0f94ec69 | iihero |
16396 | imBCySiKYOka | 31821 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 31891 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 1551 | 63942 | 2cd9584f4103 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 14383 | 63353 | c67c0f94ec69 | iihero |
16396 | imBCySiKYOka | 24888 | 51029 | 4aa3bb7753c7 | DBeaver 21.0.5 - SQLEditor <Script-15.sql> |
16396 | imBCySiKYOka | 26229 | 22785 | d2140fe02dcb | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 15814 | 63346 | 77b28d132e14 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 14384 | 63353 | c67c0f94ec69 | iihero |
16396 | imBCySiKYOka | 20247 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 3018 | 63942 | 2cd9584f4103 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 27441 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 20204 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 27478 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 32016 | 63348 | f8aa3288a874 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 16356 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 20314 | 63348 | f8aa3288a874 | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 3147 | 63349 | 24c0bf53822c | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 14385 | 63353 | c67c0f94ec69 | iihero |
16396 | imBCySiKYOka | 16392 | 63360 | a5ac188f505f | PostgreSQL JDBC Driver |
16396 | imBCySiKYOka | 24927 | 51029 | 4aa3bb7753c7 | DBeaver 21.0.5 - Main |
16396 | imBCySiKYOka | 14386 | 63353 | c67c0f94ec69 | iihero |
16396 | imBCySiKYOka | 14387 | 63353 | c67c0f94ec69 | iihero |
如果spring cloud的库不能完全满足你的需求的时候,必须了解底层的DataSource是如何一步步构建出来的。大不了自己去构建就行。
这个BatchInsert的特性在2016年引入。详见pgjdbc的源码。这里直接附上相应的changelog。
Notable changes:
- BUG: json datatype is returned as java.lang.String object, not as PGObject (fixed in 9.4.1211)
- Many improvements to
insert into .. values(?,?) -> insert .. values(?,?), (?,?)...
rewriter. Give it a try by usingreWriteBatchedInserts=true
connection property. 2-3x improvements for insert batch can be expected - Full test suite passes against PostgreSQL 9.6, and OpenJDK 9
- Performance optimization for timestamps (~TimeZone.getDefault optimization)
- Allow build-from-source on GNU/Linux without maven repositories, and add Fedora Copr test to the regression suite