Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support blob hierarchy operations with a custom delimiter #10192

Draft
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

clairernovotny
Copy link
Contributor

PR Summary

Introduces a delimiter parameter to the AzureStorage class and its methods for blob hierarchy operations.

  • AzureStorage.cs: Added _delimiter field, updated constructor, List, and ListAsync methods to use delimiter.
  • AzureStorageFactory.cs: Added _delimiter field, updated constructor, and Create method to pass delimiter.

Part of https://github.com/NuGet/Engineering/issues/5645

The `AzureStorage` class now supports a delimiter for blob hierarchy operations, allowing it to handle directory-like structures more effectively by skipping directory prefixes. The `AzureStorageFactory` class has been updated to support the new delimiter parameter, ensuring that instances of `AzureStorage` can be created with this new functionality. Minor formatting improvements were made to the code comments.
@clairernovotny clairernovotny requested a review from a team as a code owner September 27, 2024 18:44
@erdembayar
Copy link
Contributor

@clairernovotny
There're some unit tests are failing.

@@ -38,6 +40,7 @@ public AzureStorage(
logger)
{
_path = path;
_delimiter = delimiter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no delimiter is passed, the blob listing will be flat, and you won't get hierarchical groupings like folders in the results. Now with delimiter, it simulates hierarchy. Should its behavior change have a boolean flag to enable or disable the behavior?


public AzureStorage(
BlobServiceClient account,
string containerName,
string path,
string delimiter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the only delimiter allowed for storage accounts is /, I'd say, there is no point to parameterize it. Having a constant should be enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By any chance it would be \ for non-Windows machines? Not sure about actual implementation of Azure storage machines, under the hood it's it's just another VM from my understanding.

@@ -171,8 +174,14 @@ public override IEnumerable<StorageListItem> List(bool getMetadata)
blobTraits |= BlobTraits.Metadata;
}

foreach (BlobHierarchyItem blob in _directory.GetBlobsByHierarchy(traits: blobTraits, prefix: _path))
foreach (BlobHierarchyItem blob in _directory.GetBlobsByHierarchy(traits: blobTraits, delimiter: _delimiter, prefix: _path))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm running some tests and without the delimiter specified, GetBlobsByHierarchy returns a list of blobs having certain prefix, complete, regardless of directory nesting. On storage accounts with hierarchical namespace enabled, it also includes directories themselves.

When delimiter is passed, it only returns immediate children under prefix AND if prefix ends in /. If prefix does not end with /, it returns a single item. In either case it does not return all blobs with a prefix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest introducing a separate method for listing only immediate children, so there is no ambiguity/confusion as to what it returns.

@erdembayar
Copy link
Contributor

@clairernovotny
Are you actively working this PR? If not could please make it draft PR?

@clairernovotny clairernovotny marked this pull request as draft November 25, 2024 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants