feat(services/gdrive): List shows modified timestamp gdrive #5226

Open
wants to merge 1 commit into
base: main

erickguan
Contributor

@erickguan erickguan commented Oct 22, 2024

Which issue does this PR close?

Part of #4746.

Are there any user-facing changes?

None.

@erickguan erickguan force-pushed the modified-timestamp-gdrive branch 2 times, most recently from 8edbfff to 77cf29b Compare October 22, 2024 18:09
@erickguan erickguan marked this pull request as ready for review October 22, 2024 19:14
@erickguan
Contributor Author

erickguan commented Oct 22, 2024

I did some experimentation (patch) with async support for listing operations, which helps reduce runtime as expected. However, extending OpList introduces a "breaking" change. Since OpenDAL has a compatibility package, I’m happy to coordinate with you when you're preparing for a breaking release, to minimize the amount of work on both sides. There's no urgency regarding async support, though.

A few other observations from running the behavior tests (not related to the PR):

  1. The behavior test fails on my local machine but passes in CI. I’m investigating the cause casually.
  2. The UUID random generation in behavior tests creates multiple levels of UUIDs. While this results in an extremely low chance of collision in the CI for OpenDAL's test accounts, I’d like to reduce the number of levels when running against my environment too.

// Return self at the first page.
if ctx.token.is_empty() && !ctx.done {
    let path = build_rel_path(&self.core.root, &self.path);
    let e = oio::Entry::new(&path, Metadata::new(EntryMode::DIR));
    let mut metadata = Metadata::new(EntryMode::DIR);
    if stat_file_metadata {
Member

Hi, sorry for not making the Metakey behavior clearer. (I'm working on this.)

Metakey is a best-effort hint and is not processed server-side. During listing, the service only needs to return the most complete metadata it already has available.

The Operator will call stat as required, for example:

    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        // Returns `None` if we have errored.
        if self.errored {
            return Poll::Ready(None);
        }
        // Trying to pull more tasks if there are more space.
        if self.tasks.has_remaining() {
            // Building future if we have a lister available.
            if let Some(mut lister) = self.lister.take() {
                let fut = async move {
                    let res = lister.next_dyn().await;
                    (lister, res)
                };
                self.fut = Some(Box::pin(fut));
            }
            if let Some(fut) = self.fut.as_mut() {
                if let Poll::Ready((lister, entry)) = fut.as_mut().poll(cx) {
                    self.lister = Some(lister);
                    self.fut = None;
                    match entry {
                        Ok(Some(oe)) => {
                            let (path, metadata) = oe.into_entry().into_parts();
                            if metadata.contains_metakey(self.required_metakey) {
                                self.tasks
                                    .push_back(StatTask::Known(Some((path, metadata))));
                            } else {
                                let acc = self.acc.clone();
                                let fut = async move {
                                    let res = acc.stat(&path, OpStat::default()).await;
                                    (path, res.map(|rp| rp.into_metadata()))
                                };
                                self.tasks.push_back(StatTask::Stating(Box::pin(fut)));
                            }
                        }
                        Ok(None) => {
                            self.lister = None;
                        }
                        Err(err) => {
                            self.errored = true;
                            return Poll::Ready(Some(Err(err)));
                        }
                    }
                }
            }
        }
        // Try to poll tasks
        if let Some((path, rp)) = ready!(self.tasks.poll_next_unpin(cx)) {
            let metadata = rp?;
            return Poll::Ready(Some(Ok(Entry::new(path, metadata))));
        }
        if self.lister.is_some() || self.fut.is_some() {
            Poll::Pending
        } else {
            Poll::Ready(None)
        }
    }
}
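
The fallback in the snippet above (reuse the listed metadata when it already covers the requested keys, otherwise call stat) can be shown in isolation. This is only a minimal synchronous sketch: `Key`, `Meta`, and `resolve_metadata` are hypothetical names made up for illustration and are not OpenDAL APIs, and the real code does this asynchronously through queued tasks.

```rust
/// Hypothetical metadata keys a caller may request (not OpenDAL's `Metakey`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Key {
    Mode,
    LastModified,
}

/// Hypothetical, trimmed-down metadata record.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Meta {
    is_dir: bool,
    /// Seconds since the epoch; `None` when the listing did not supply it.
    last_modified: Option<u64>,
}

impl Meta {
    /// Returns true when every requested key is already present.
    fn contains(&self, required: &[Key]) -> bool {
        required.iter().all(|key| match key {
            // The entry mode is always known in this sketch.
            Key::Mode => true,
            Key::LastModified => self.last_modified.is_some(),
        })
    }
}

/// Uses the listed metadata if it is sufficient; otherwise falls back to
/// the `stat` closure, which stands in for a per-entry stat request.
fn resolve_metadata(listed: Meta, required: &[Key], stat: impl FnOnce() -> Meta) -> Meta {
    if listed.contains(required) {
        listed
    } else {
        stat()
    }
}
```

Under this assumption, a service that already returns the modified timestamp while listing never triggers the extra stat call, while a service that omits it pays one stat per entry only when the caller asks for that key.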

Contributor Author

Thanks for the review. I'm seeing a strange problem with the Metadata instance: Gdrivebackend::stat returns metadata with a timestamp, but Lister::poll_next receives Metadata without the timestamp.

I will debug it a bit.
