Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

op.stat("/") would return OK even if the bucket does not exist #4894

Open
1 task
tisonkun opened this issue Jul 13, 2024 · 3 comments
Open
1 task

op.stat("/") would return OK even if the bucket does not exist #4894

tisonkun opened this issue Jul 13, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@tisonkun
Copy link
Member

tisonkun commented Jul 13, 2024

Describe the bug

Configure an S3 compatible backend, said MinIO.

I'd like to use op.stat("/") to ensure that the bucket is ready, or else it should return not found.

Steps to Reproduce

op = /* create operator */ ;
(|| async { op.stat("/").await })
                .retry(&ExponentialBuilder::default())
                .await
                .unwrap();

op.remove_all("").await.unwrap()

stat will return Metadata { metakey: FlagSet(Complete | Mode), mode: DIR, cache_control: None, content_disposition: None, content_length: None, content_md5: None, content_range: None, content_type: None, etag: None, last_modified: None, version: None } but the remove_all will fail with NotFound (permanent) at List::next => S3Error { code: "NoSuchBucket", message: "The specified bucket does not exist" ... }.

Expected Behavior

stat should return NotFound or NotSuchBucket if the bucket does not exist.

Additional Context

Related code exists at CompleteAccessor::complete_stat:

async fn complete_stat(&self, path: &str, args: OpStat) -> Result<RpStat> {
let capability = self.meta.full_capability();
if !capability.stat {
return Err(self.new_unsupported_error(Operation::Stat));
}
if path == "/" {
return Ok(RpStat::new(Metadata::new(EntryMode::DIR)));
}

Are you willing to submit a PR to fix this bug?

  • Yes, I would like to submit a PR.
@Xuanwo
Copy link
Member

Xuanwo commented Jul 14, 2024

The stat("/") function is treated specially because opendal requires the storage backend to ensure that the service is available and that the root is always valid.

We implemented this feature because we noticed that users frequently perform stat("/") before beginning operations, and 99% of these actions are unnecessary. We have a separate check() function for this purpose.

It does surprise the user, though; I will try to document it instead.

@tisonkun
Copy link
Member Author

tisonkun commented Jul 14, 2024

The stat("/") function is treated specially because opendal requires the storage backend to ensure that the service is available and that the root is always valid.

What does it mean? If the operator is misconfigured, such op should fail.

I will try to document it instead.

I have two questions here:

  1. What if we fall through the following code without special checks?
  2. What if we call check for stat("/")?

@Xuanwo
Copy link
Member

Xuanwo commented Jul 15, 2024

There is a history behind using stat("/") and stat("path/to/dir/").

Previously, services like S3 did not support statting a directory; they would always return Ok because object storage services lack a concept of directories. As a workaround, we treated / specially to enable it to return quickly without going through the entire process.

In newer versions, we've improved how we handle stat("path/to/dir/") by verifying the existence of the path on S3 using list("path/to/dir/"). However, listing operations are expensive (10 times more than GET/HEAD) and slow (1/10 as fast as HEAD). Therefore, we have kept the logic for stat("/") unchanged.

Perhaps we can relocate the list-stat simulation logic to S3 and allow users to enable/disable it, ensuring that stat("/") is always sent with a real request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants