HDDS-8784. trigger compaction outside of volume check. #6611

guohao-rosicky · 2024-04-30T10:54:37Z

What changes were proposed in this pull request?

Currently RocksDB compaction is triggered for schema v3 RocksDBs (one DB per volume) as part of the volume check. Periodic manual compaction is necessary because import of sst files as part of container replication can lead to many small sst files in RocksDB.

This operation may slow down the volume check and can have unintended consequences if the minimum gap between volume checks is set very high or very low. Ideally compaction should be triggered on its own independently controlled thread.
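As a rough illustration of "its own independently controlled thread", the sketch below schedules one periodic compaction task per volume on a dedicated scheduled executor. It is not the code in this PR: `CompactionSchedulerSketch` and `CompactableVolume` are hypothetical names, and the interval and thread count are assumed to come from configuration.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: periodic compaction decoupled from the volume check.
final class CompactionSchedulerSketch {
  /** Hypothetical stand-in for a volume that owns one RocksDB instance. */
  interface CompactableVolume {
    void compactDb();
  }

  private final ScheduledExecutorService executor;

  CompactionSchedulerSketch(int threads) {
    // Dedicated pool, sized independently of the volume checker.
    this.executor = Executors.newScheduledThreadPool(threads);
  }

  void start(List<? extends CompactableVolume> volumes,
      long interval, TimeUnit unit) {
    for (CompactableVolume volume : volumes) {
      executor.scheduleWithFixedDelay(() -> {
        try {
          volume.compactDb();
        } catch (RuntimeException e) {
          // An uncaught exception would cancel all later runs of this task.
          System.err.println("Compaction failed: " + e);
        }
      }, interval, interval, unit);
    }
  }

  void stop() {
    executor.shutdownNow();
  }
}
```

With this shape, the compaction interval no longer depends on the minimum gap between volume checks.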

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-8784

How was this patch tested?

UT: org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer#testAutoCompactionSmallSstFile

adoroszlai · 2024-05-06T11:20:42Z

Thanks @guohao-rosicky for working on this. Several tests are failing; please check:

https://github.com/guohao-rosicky/ozone/actions/runs/8892159377/job/24417356838#step:5:1252

https://github.com/guohao-rosicky/ozone/actions/runs/8892159377/job/24417354447#step:5:2662

https://github.com/guohao-rosicky/ozone/actions/runs/8892159377/job/24417354447#step:5:2875

https://github.com/guohao-rosicky/ozone/actions/runs/8892159377/job/24417357696#step:5:2144

ChenSammi · 2024-05-14T07:17:32Z

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java

+ dbVolumeSet != null ? 1 :
+ dnConf.getAutoCompactionSmallSstFileExecutors(),
+ new ThreadFactoryBuilder().setNameFormat(
+ "RocksDB Compact Thread-%d").build());


Suggested name format: datanodeDetails.threadNamePrefix() + "RocksDBCompactionThread-%d"
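To show what a prefixed thread name gives you, here is a minimal sketch that uses only the JDK instead of Guava's `ThreadFactoryBuilder`. The class name `PrefixedThreadFactorySketch` and the prefix value in the usage note are hypothetical; the real prefix would come from `datanodeDetails.threadNamePrefix()`.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: names compaction threads with a per-datanode prefix.
final class PrefixedThreadFactorySketch implements ThreadFactory {
  private final String prefix;
  private final AtomicInteger counter = new AtomicInteger();

  PrefixedThreadFactorySketch(String prefix) {
    this.prefix = prefix;
  }

  @Override
  public Thread newThread(Runnable task) {
    // Plain concatenation avoids format-string issues if the prefix has '%'.
    String name = prefix + "RocksDBCompactionThread-"
        + counter.getAndIncrement();
    Thread thread = new Thread(task, name);
    thread.setDaemon(true);
    return thread;
  }
}
```

For example, `new PrefixedThreadFactorySketch("dn1-")` would name its first thread `dn1-RocksDBCompactionThread-0`, which makes thread dumps easier to attribute when several datanodes run in one JVM (as in tests).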

ChenSammi · 2024-05-14T07:26:58Z

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java

+ HddsVolume hddsVolume = (HddsVolume) volume;
+ CompletableFuture.runAsync(hddsVolume::compactDb, compactExecutor);
+ // If set dbVolumeSet only need to execute the compact db once
+ if (dbVolumeSet != null) {


I don't fully understand this. Every HddsVolume has its own RocksDB that needs compaction, and that holds even when dbVolume is configured.

ChenSammi · 2024-05-14T07:33:58Z

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java

 if (SchemaV3.isFinalizedAndEnabled(config)) {
 HddsVolumeUtil.loadAllHddsVolumeDbStore(
 volumeSet, dbVolumeSet, false, LOG);
+ if (dnConf.autoCompactionSmallSstFile()) {
+ this.compactExecutor = Executors.newScheduledThreadPool(
+ dbVolumeSet != null ? 1 :


I guess this thread pool is meant to speed up DB compaction, so the thread count matters when there are multiple RocksDB instances to compact. RocksDB instances and HddsVolumes have a 1:1 mapping, so even if dbVolumeSet is set, 10 configured HddsVolumes mean 10 RocksDB instances to compact. So I think the dbVolumeSet null check is not required.
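The 1:1 mapping described above suggests submitting one compaction task per volume, with no special case for dbVolumeSet. A minimal sketch of that fan-out follows; `PerVolumeCompactionSketch` and `VolumeDb` are hypothetical names rather than types from the PR.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical sketch: one compaction task per volume, whatever the DB layout.
final class PerVolumeCompactionSketch {
  /** Hypothetical stand-in for HddsVolume's compaction entry point. */
  interface VolumeDb {
    void compactDb();
  }

  static CompletableFuture<Void> compactAll(
      List<? extends VolumeDb> volumes, Executor executor) {
    CompletableFuture<?>[] futures = volumes.stream()
        .map(volume -> CompletableFuture.runAsync(volume::compactDb, executor))
        .toArray(CompletableFuture[]::new);
    // Completes once every per-volume compaction has finished.
    return CompletableFuture.allOf(futures);
  }
}
```

With 10 volumes and a pool of 4 threads, all 10 databases are compacted, at most 4 at a time.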

ChenSammi · 2024-05-14T07:41:58Z

...c/main/java/org/apache/hadoop/ozone/container/common/statemachine/DatanodeConfiguration.java

+ private long autoCompactionSmallSstFileIntervalMinutes =
+ AUTO_COMPACTION_SMALL_SST_FILE_INTERVAL_MINUTES_DEFAULT;
+
+ @Config(key = "rocksdb.auto-compaction-small-sst-file.executors",


"rocksdb.auto-compaction-small-sst-file.executors" ->

"rocksdb.auto-compaction-small-sst-file.threads"

ChenSammi · 2024-05-14T07:46:51Z

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/ozoneimpl/OzoneContainer.java

@@ -121,7 +124,7 @@ public class OzoneContainer {
 private final ReplicationServer replicationServer;
 private DatanodeDetails datanodeDetails;
 private StateContext context;
-
+ private ScheduledExecutorService compactExecutor;


compactExecutor -> dbCompactionExecutorService

ChenSammi · 2024-05-14T07:53:22Z

...tainer-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/HddsVolume.java

+ public boolean compactDb() {
+ File dbDir = getDbParentDir();
+ File dbFile = new File(dbDir, CONTAINER_DB_NAME);
+ if (dbFile.exists() && dbFile.canRead()) {


There is no need to check exists and canRead here; check() already covers that. If they are checked here, the volume should be marked as failed when the check fails. The boolean result is not used anywhere, so a void return may be better.

ChenSammi · 2024-05-14T08:01:58Z

...c/main/java/org/apache/hadoop/ozone/container/common/statemachine/DatanodeConfiguration.java

+ AUTO_COMPACTION_SMALL_SST_FILE_INTERVAL_MINUTES_DEFAULT;
+
+ @Config(key = "rocksdb.auto-compaction-small-sst-file.executors",
+ defaultValue = "1",


Currently, RocksDB compaction runs in the volume checker, which uses an unbounded thread pool (Executors.newCachedThreadPool). I am not sure whether a default of 1 thread will affect DN performance.
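To make the concern concrete, the sketch below measures how many tasks run at the same time on a given pool. It uses sleeping tasks as a stand-in for compactions, so it only illustrates the scheduling difference and says nothing about real RocksDB cost; `PoolConcurrencySketch` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: observe peak task concurrency for a given pool.
final class PoolConcurrencySketch {
  static int peakConcurrency(ExecutorService pool, int tasks, long workMillis) {
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < tasks; i++) {
      futures.add(pool.submit(() -> {
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        try {
          Thread.sleep(workMillis); // stand-in for one DB compaction
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          running.decrementAndGet();
        }
      }));
    }
    try {
      for (Future<?> future : futures) {
        future.get(); // wait for every simulated compaction
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }
}
```

A pool from `Executors.newFixedThreadPool(1)` runs the tasks strictly one after another, while `Executors.newCachedThreadPool()` creates threads on demand and can run them all concurrently. With one thread, total compaction time grows with the number of volumes, which is the trade-off behind the default value.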

HDDS-8784. trigger compaction outside of volume check.

4febb26

adoroszlai marked this pull request as draft May 6, 2024 11:17

guohao-rosicky added 2 commits May 8, 2024 11:15

fix UT

18a05f9

fix UT

41f12bb

guohao-rosicky marked this pull request as ready for review May 8, 2024 06:04

guohao-rosicky requested review from ChenSammi, adoroszlai, kerneltime and errose28 May 8, 2024 12:25

ChenSammi reviewed May 14, 2024

guohao-rosicky and others added 2 commits May 29, 2024 17:53

code review

4b33538

Merge branch 'apache:master' into guohao-HDDS-8784-dev

6898d58

guohao-rosicky requested a review from ChenSammi May 29, 2024 10:04
