Skip to content

Latest commit

 

History

History
226 lines (189 loc) · 10.7 KB

20220702_01.md

File metadata and controls

226 lines (189 loc) · 10.7 KB

spring cloud中如何有效的设置reWriteBatchedInserts提高PG性能

作者

iihero

日期

2022-07-02

标签

Java, spring-boot, spring cloud, PostgreSQL, reWriteBatchedInserts, batchInsert

背景

在使用spring-boot JPA 结合 Hibernate/Eclipse Link的过程中, 你会发现一个有趣的现象,就是使用的batchInsert并不是真正的batch. 详情可以看看: https://jdbc.postgresql.org/documentation/head/connect.html https://vladmihalcea.com/postgresql-multi-row-insert-rewritebatchedinserts-property/

而在使用spring cloud生成data source的过程中,你会痛苦的发现,试了好多种方式,都不能有效的将这个property给设上。目前spring-boot 2.7里头默认的connection pool用的是HiKari,替代以前的tomcat 的DBCP. 性能确实是没话说。可是在设置property方面,确实有些迷惑人。

一、尝试一

直接使用spring cloud 中的PoolConfig etc.

 PoolConfig poolConfig = new PoolConfig(minPoolSize, maxParallelRequests, 5000);
 Map<String, Object> connProperties = new HashMap<>();
 connProperties.put("connectionTimeout", 5000);
 connProperties.put("reWriteBatchedInserts", "true");
 ConnectionConfig connConfig = new ConnectionConfig(
 "reWriteBatchedInserts=true&options=-c%20statement_timeout=2000&ApplicationName=iihero");
 DataSourceConfig dbConfig = new DataSourceConfig(poolConfig, connConfig, null, connProperties);
 DataSource dataSource = connectionFactory().dataSource(dbConfig);

发现上边connConfig中传的参数完全被HiKari忽略。

二、尝试二

在一的基础上,尝试修改HiKari中的Datasource properties。

 PoolConfig poolConfig = new PoolConfig(minPoolSize, maxParallelRequests, 5000);
 Map<String, Object> connProperties = new HashMap<>();
 connProperties.put("connectionTimeout", 5000);
 connProperties.put("reWriteBatchedInserts", "true");
 ConnectionConfig connConfig = new ConnectionConfig(
 "reWriteBatchedInserts=true&options=-c%20statement_timeout=2000&ApplicationName=iihero");
 DataSourceConfig dbConfig = new DataSourceConfig(poolConfig, connConfig, null, connProperties);
 DataSource dataSource = connectionFactory().dataSource(dbConfig);

DelegatingDataSource myDs = (DelegatingDataSource) dataSource;

HikariDataSource ds = (HikariDataSource) myDs.getTargetDataSource();
ds.getDataSourcePropeties().put("reWriteBatchedInserts", "true");

依然无用。

三、尝试三

直接在application.yml文件中定义附加属性

spring:
  datasource:
    hikari:
      data-source-properties:
        rewriteBatchedStatements: true
        ApplicationName: iihero

还是不能用。提示,其实如果是独立的非cloud的应用,这种方式还是可以的。把完整的datasource定义在上边定义出来即可。

四、直接构造HiKari DataSource

	@Bean
	@Primary
	public DataSourceProperties dataSourceProperties(Cloud cloud) {
		DataSourceProperties properties = new DataSourceProperties();
		if (cloud != null) {
			List<ServiceInfo> infos = cloud.getServiceInfos(DataSource.class);
			if (infos.size() >= 1) {
				RelationalServiceInfo info = (RelationalServiceInfo) infos.get(0);
				properties.setUrl(info.getJdbcUrl()
						+ "&reWriteBatchedInserts=true&options=-c%20statement_timeout=2000&ApplicationName=iihero");
				properties.setUsername(info.getUserName());
				properties.setPassword(info.getPassword());
			}
		}
		log.warn("Bean created... {}", properties.getUrl());
		return properties;
	}

	@Bean
	@ConditionalOnBean(DataSourceProperties.class)
	public DataSource dataSource(DataSourceProperties props) {
		HikariConfig config = new HikariConfig();
		config.setConnectionTimeout(5000L);
		config.setMaximumPoolSize(maxParallelRequests);
		config.setJdbcUrl(props.getUrl());
		config.setUsername(props.getUsername());
		config.setPassword(props.getPassword());
		HikariDataSource dataSource = new HikariDataSource(config);
		Properties p = dataSource.getDataSourceProperties();

		for (Object key : p.keySet()) {
			System.err.println(String.format("Hikari datasource Key: " + key + "value: " + p.get(key)));
		}
		return dataSource;
	}

最后发现这种方式是能工作的。
当然,上边也是可以略微调整的。在HiKariConfig的设置当中加上下边的code。true这里用的是String, 不是Boolean。

		Properties dsProperties = new Properties();
		dsProperties.put("reWriteBatchedInserts", "true");
		config.setDataSourceProperties(dsProperties);

五、验证结果

直接连接上目标数据库PostgreSQL. 通过查询pg_stat_activity视图,一目了然

select * from pg_stat_activity where application_name = 'iihero' or application_name like 'DBeaver%'

结果如:

datname pid usesysid usename datid application_name client_addr
63353 c67c0f94ec69 16396 imBCySiKYOka 14411 iihero
63353 c67c0f94ec69 16396 imBCySiKYOka 14406 iihero
63353 c67c0f94ec69 16396 imBCySiKYOka 14400 iihero
63353 c67c0f94ec69 16396 imBCySiKYOka 14398 iihero
63353 c67c0f94ec69 16396 imBCySiKYOka 14397 iihero

其实,如果不指定ApplicationName, 默认的值应该是:

select * from pg_stat_activity -- where application_name = 'iihero' or application_name like 'DBeaver%'

结果如:

datid datname pid usesysid usename application_name
16396 imBCySiKYOka 17117 22785 d2140fe02dcb PostgreSQL JDBC Driver
16396 imBCySiKYOka 3055 63335 f8d4ba83dfea PostgreSQL JDBC Driver
16396 imBCySiKYOka 31084 63347 7b985656a6fc PostgreSQL JDBC Driver
16396 imBCySiKYOka 29489 63335 f8d4ba83dfea PostgreSQL JDBC Driver
16396 imBCySiKYOka 27090 63335 f8d4ba83dfea PostgreSQL JDBC Driver
16396 imBCySiKYOka 29159 63346 77b28d132e14 PostgreSQL JDBC Driver
16396 imBCySiKYOka 32562 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 741 63335 f8d4ba83dfea PostgreSQL JDBC Driver
16396 imBCySiKYOka 13393 63358 f41e5f1aef8e PostgreSQL JDBC Driver
16396 imBCySiKYOka 16397 22785 d2140fe02dcb PostgreSQL JDBC Driver
16396 imBCySiKYOka 22048 63943 6cc01164ffd7 PostgreSQL JDBC Driver
16396 imBCySiKYOka 3734 63347 7b985656a6fc PostgreSQL JDBC Driver
16396 imBCySiKYOka 21308 63340 ae13e398433e PostgreSQL JDBC Driver
16396 imBCySiKYOka 27439 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 24047 22785 d2140fe02dcb PostgreSQL JDBC Driver
16396 imBCySiKYOka 29849 63943 6cc01164ffd7 PostgreSQL JDBC Driver
16396 imBCySiKYOka 29850 63346 77b28d132e14 PostgreSQL JDBC Driver
16396 imBCySiKYOka 22961 63357 a45a63c1d363 PostgreSQL JDBC Driver
16396 imBCySiKYOka 21725 22785 d2140fe02dcb PostgreSQL JDBC Driver
16396 imBCySiKYOka 22039 63340 ae13e398433e PostgreSQL JDBC Driver
16396 imBCySiKYOka 24780 63943 6cc01164ffd7 PostgreSQL JDBC Driver
16396 imBCySiKYOka 22414 22785 d2140fe02dcb PostgreSQL JDBC Driver
16396 imBCySiKYOka 3192 22785 d2140fe02dcb PostgreSQL JDBC Driver
16396 imBCySiKYOka 20968 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 14376 63353 c67c0f94ec69 iihero
16384 rdsadmin 18301 10 rdsadmin PostgreSQL JDBC Driver
16396 imBCySiKYOka 1879 63943 6cc01164ffd7 PostgreSQL JDBC Driver
16396 imBCySiKYOka 31779 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 13621 63347 7b985656a6fc PostgreSQL JDBC Driver
16396 imBCySiKYOka 24003 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 14381 63353 c67c0f94ec69 iihero
16396 imBCySiKYOka 1939 63942 2cd9584f4103 PostgreSQL JDBC Driver
16396 imBCySiKYOka 31845 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 3163 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 24236 63348 f8aa3288a874 PostgreSQL JDBC Driver
16396 imBCySiKYOka 31876 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 31824 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 29516 63943 6cc01164ffd7 PostgreSQL JDBC Driver
16396 imBCySiKYOka 28031 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 24974 51029 4aa3bb7753c7 DBeaver 21.0.5 - SQLEditor <Script-16.sql>
16396 imBCySiKYOka 14382 63353 c67c0f94ec69 iihero
16396 imBCySiKYOka 31821 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 31891 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 1551 63942 2cd9584f4103 PostgreSQL JDBC Driver
16396 imBCySiKYOka 14383 63353 c67c0f94ec69 iihero
16396 imBCySiKYOka 24888 51029 4aa3bb7753c7 DBeaver 21.0.5 - SQLEditor <Script-15.sql>
16396 imBCySiKYOka 26229 22785 d2140fe02dcb PostgreSQL JDBC Driver
16396 imBCySiKYOka 15814 63346 77b28d132e14 PostgreSQL JDBC Driver
16396 imBCySiKYOka 14384 63353 c67c0f94ec69 iihero
16396 imBCySiKYOka 20247 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 3018 63942 2cd9584f4103 PostgreSQL JDBC Driver
16396 imBCySiKYOka 27441 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 20204 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 27478 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 32016 63348 f8aa3288a874 PostgreSQL JDBC Driver
16396 imBCySiKYOka 16356 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 20314 63348 f8aa3288a874 PostgreSQL JDBC Driver
16396 imBCySiKYOka 3147 63349 24c0bf53822c PostgreSQL JDBC Driver
16396 imBCySiKYOka 14385 63353 c67c0f94ec69 iihero
16396 imBCySiKYOka 16392 63360 a5ac188f505f PostgreSQL JDBC Driver
16396 imBCySiKYOka 24927 51029 4aa3bb7753c7 DBeaver 21.0.5 - Main
16396 imBCySiKYOka 14386 63353 c67c0f94ec69 iihero
16396 imBCySiKYOka 14387 63353 c67c0f94ec69 iihero

总结

如果spring cloud的库不能完全满足你的需求的时候,必须了解底层的DataSource是如何一步步构建出来的。大不了自己去构建就行。

这个BatchInsert的特性在2016年引入。详见pgjdbc的源码。这里直接附上相应的changelog。

Version 9.4.1209 (2016-07-15)

Notable changes:

  • BUG: json datatype is returned as java.lang.String object, not as PGObject (fixed in 9.4.1211)
  • Many improvements to insert into .. values(?,?) -> insert .. values(?,?), (?,?)... rewriter. Give it a try by using reWriteBatchedInserts=true connection property. 2-3x improvements for insert batch can be expected
  • Full test suite passes against PostgreSQL 9.6, and OpenJDK 9
  • Performance optimization for timestamps (~TimeZone.getDefault optimization)
  • Allow build-from-source on GNU/Linux without maven repositories, and add Fedora Copr test to the regression suite

Flag Counter