-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let the DataPartitionTable be automatically cleanable #14737
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't finished reading all the code yet, I have some thoughts for discussing:
If the default execution interval is every 2 hours, then maybe adding a periodically executed Procedure is more lightweight compared to adding a new thread pool?
for (int retry = 0; retry < 120; retry++) { | ||
boolean partitionTableAutoCleaned = true; | ||
TDataPartitionTableResp resp = client.getDataPartitionTable(req); | ||
if (TSStatusCode.SUCCESS_STATUS.getStatusCode() == resp.getStatus().getCode()) { | ||
Map<String, Map<TSeriesPartitionSlot, Map<TTimePartitionSlot, List<TConsensusGroupId>>>> | ||
dataPartitionTable = resp.getDataPartitionTable(); | ||
for (Map.Entry< | ||
String, | ||
Map<TSeriesPartitionSlot, Map<TTimePartitionSlot, List<TConsensusGroupId>>>> | ||
e1 : dataPartitionTable.entrySet()) { | ||
for (Map.Entry<TSeriesPartitionSlot, Map<TTimePartitionSlot, List<TConsensusGroupId>>> | ||
e2 : e1.getValue().entrySet()) { | ||
if (e2.getValue().size() != 1) { | ||
// The PartitionTable of each database should only contain 1 time partition slot | ||
partitionTableAutoCleaned = false; | ||
break; | ||
} | ||
} | ||
if (!partitionTableAutoCleaned) { | ||
break; | ||
} | ||
} | ||
} | ||
if (partitionTableAutoCleaned) { | ||
return; | ||
} | ||
TimeUnit.SECONDS.sleep(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Considering using Awaitility to replace the outer for-loop and sleep
- (As Chatgpt suggested :) please checkout whether these two inner loop can be simplified, such as:
partitionTableAutoCleaned = resp.getDataPartitionTable().entrySet().stream()
.flatMap(e1 -> e1.getValue().entrySet().stream())
.allMatch(e2 -> e2.getValue().size() == 1);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An awesome suggestion! The corresponding test codes are simplified significantly!
iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/concurrent/ThreadName.java
Outdated
Show resolved
Hide resolved
for (String database : databases) { | ||
long subTreeMaxTTL = getTTLManager().getDatabaseMaxTTL(database); | ||
databaseTTLMap.put( | ||
database, Math.max(subTreeMaxTTL, databaseTTLMap.getOrDefault(database, -1L))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add some judgement like "isDatabaseExisted(database) && 0 < ttl && ttl < Long.MAX_VALUE" here to remove overhead?
BTW, If all the databases don't have ttl, we can logically just do this check and find that none of them need to be cleaned up, so there's no need to do a consensus write
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pinpointing this logic enhancement. The judgement is available at PartitionTableAutoCleaner.
public PartitionManager(IManager configManager, PartitionInfo partitionInfo) { | ||
this.configManager = configManager; | ||
this.partitionInfo = partitionInfo; | ||
this.regionMaintainer = | ||
IoTDBThreadPoolFactory.newSingleThreadScheduledExecutor( | ||
ThreadName.CONFIG_NODE_REGION_MAINTAINER.getName()); | ||
this.partitionCleaner = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe try to reuse procedure periodic tasks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I now employ the PartitionTableAutoCleaner rather than creating an extra thread pool.
...src/test/java/org/apache/iotdb/confignode/it/partition/IoTDBPartitionTableAutoCleanTest.java
Outdated
Show resolved
Hide resolved
...src/test/java/org/apache/iotdb/confignode/it/partition/IoTDBPartitionTableAutoCleanTest.java
Outdated
Show resolved
Hide resolved
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered concurrency issues, such as setTTL and Cleaner running simultaneously?
(I think one solution could be to clean up partitions only after they’ve exceeded the TTL by one hour, but you might have solved this in another way :)
Thanks for raising this critical issue. To address your concern, please refer to the
Here, the removing condition is |
A data partition is not necessary to exist when all corresponding data are expired given the pre-configurated TTL. Hence, this PR added a thread to automatically clean these expired data partitions, making the cache size of the DataPartitionTable is always acceptable! Specifically, the main updates include:
Incidentally, the corresponding IT is available at "IoTDBPartitionTableAutoCleanTest."