Primary Key table with persistent index: Stream Load throughput fluctuates sharply when importing large volumes of data #45400
Comments
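Please share the test environment configuration, the import frequency, and how much data each import task loads.
Error: NULL value in non-nullable column 'L_ORDERKEY'. Row: [NULL, 156267372, 1267403, 1, 32, 42609.9, 0.07, 0.01, 'N', 'O', 1997-09-06, 1997-11-11, 1997-10-03, 'COLLECT COD', 'MAIL', 'ges against the quickly regula', 0]. Please check whether any import tasks failed. The standard lineitem dataset should contain some rows whose primary key is NULL, and Primary Key tables do not support NULL primary keys.
I load the data into MySQL first and then import it from MySQL into StarRocks. The MySQL primary key guarantees that the key values are non-NULL.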
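To illustrate the check discussed in the comments above, here is a minimal sketch that counts rows with a NULL primary-key column before they are sent to StarRocks. The helper name `count_null_keys`, the in-memory row format, and the key columns `L_ORDERKEY` and `L_LINENUMBER` are assumptions for illustration; the issue does not state the table's actual key definition.

```python
from typing import Any, Iterable, Mapping, Sequence


def count_null_keys(rows: Iterable[Mapping[str, Any]],
                    key_columns: Sequence[str]) -> int:
    """Count rows in which at least one primary-key column is None (NULL)."""
    return sum(
        1
        for row in rows
        if any(row.get(col) is None for col in key_columns)
    )


# Hypothetical sample rows; real rows would come from the source database.
rows = [
    {"L_ORDERKEY": 1, "L_LINENUMBER": 1},
    {"L_ORDERKEY": None, "L_LINENUMBER": 1},   # NULL key: rejected by a PK table
    {"L_ORDERKEY": 2, "L_LINENUMBER": None},   # NULL key: rejected by a PK table
]

# Assumed key columns, not confirmed by the issue.
bad_rows = count_null_keys(rows, ["L_ORDERKEY", "L_LINENUMBER"])
print(f"rows with a NULL key column: {bad_rows}")
```

A non-zero count would suggest that some import tasks failed on NULL keys rather than that the index itself is slow.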
Enhancement
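StarRocks version 3.1.11. I am using DataX with 10 concurrent channels to import the TPC-H 1000 GB lineitem table, roughly 6 billion rows. Once about 2 billion rows have been imported, performance starts to fluctuate sharply: the import speed drops from 160 MB/s to 20~90 MB/s, and after a while it recovers to 150 MB/s.
I tried adjusting some pindex-related parameters (such as enable_pindex_read_by_page=true and enable_parallel_get_and_bf=false). Performance improved somewhat, but it still fluctuates sharply.
I also tested the Doris primary key model, which was fairly stable and held an import speed of 120 MB/s throughout.
I hope this can be optimized so that data import runs at a stable speed.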
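One way to make "fluctuates sharply" comparable across runs is to summarize sampled throughput with its mean, range, and coefficient of variation. The sketch below is only an illustration: the function name `throughput_summary` is hypothetical, and the sample lists are made-up values that echo the speeds quoted above, not measured data.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def throughput_summary(samples_mb_s: Sequence[float]) -> Dict[str, float]:
    """Summarize throughput samples (MB/s): mean, min, max, and
    coefficient of variation (population std dev divided by mean)."""
    avg = mean(samples_mb_s)
    return {
        "mean": avg,
        "min": min(samples_mb_s),
        "max": max(samples_mb_s),
        "cv": pstdev(samples_mb_s) / avg,
    }


# Illustrative samples only, not measurements from the issue.
fluctuating = throughput_summary([160, 20, 90, 150])
steady = throughput_summary([120, 120, 120])

print("fluctuating:", fluctuating)
print("steady:", steady)
```

A coefficient of variation near zero corresponds to the steady behavior described for the comparison run, while a large value reflects the swings reported here.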