Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use caffeine caches, adds in memory tracking [DRAFT] #86

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ahmarsuhail
Copy link
Contributor

Issue #, if available:

Description of changes:

DRAFT

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@@ -46,12 +48,14 @@ public S3SeekableInputStreamFactory(
this.parquetMetadataStore = new ParquetMetadataStore(configuration.getLogicalIOConfiguration());
this.objectMetadataStore =
new MetadataStore(objectClient, telemetry, configuration.getPhysicalIOConfiguration());
this.memoryTracker = new MemoryTracker();
this.objectBlobStore =
new BlobStore(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we probably want to start moving our classes to the builder pattern. These constructors are getting pretty huge

this.blobCache =
Caffeine.newBuilder()
.maximumSize(configuration.getBlobStoreCapacity())
.expireAfterWrite(Duration.ofMillis(configuration.getCacheEvictionTimeMillis()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we chatted about this - should/can we use both expire after write and expire after access?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, looked into it..so it's not recommend to use both: ben-manes/caffeine#511,

I am thinking of going with expireAfterWrite for now. I actually got a bit stuck with this, did some benchmarking a few times and it was failing. Seems like it's likely to do with cache eviction, so I will need to look into this some more. Will get back to it after my PTO or pass onto someone else from the team.

*/
public class MemoryTracker {

@Getter long memoryUsed;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would guess this class is supposed to be thread safe, right? You want to make this AtomicLong

@@ -76,6 +85,8 @@ public static PhysicalIOConfiguration fromConfiguration(ConnectorConfiguration c
.readAheadBytes(configuration.getLong(READ_AHEAD_BYTES_KEY, DEFAULT_READ_AHEAD_BYTES))
.maxRangeSizeBytes(configuration.getLong(MAX_RANGE_SIZE_BYTES_KEY, DEFAULT_MAX_RANGE_SIZE))
.partSizeBytes(configuration.getLong(PART_SIZE_BYTES_KEY, DEFAULT_PART_SIZE))
.cacheEvictionTimeMillis(
configuration.getInt(CACHE_EVICTION_TIME_KEY, DEFAULT_CACHE_EVICTION_TIME_MILLIS))
.build();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait. Are we using this for both Blob cache and Block cache, with the same config?
Not saying this is wrong, but should these be configured separately?

* @param bytesAllocated bytes allocated
* @return total memory in use
*/
public long incrementMemoryUsed(long bytesAllocated) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: probably want to use more consistent names here, something like trackAllocate trackFree or something?

@@ -7,6 +7,7 @@ private Constants() {}

public static final int ONE_KB = 1024;
public static final int ONE_MB = 1024 * 1024;
public static final int ONE_GB = ONE_MB * ONE_MB;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use this anywhere?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants