prefix option of S3FileSystem.find() doesn't work #684

Open
nkwook opened this issue Jan 10, 2023 · 5 comments

Comments

@nkwook

nkwook commented Jan 10, 2023

Hello, I'm using s3fs with Python 3.11.

The prefix parameter of
https://s3fs.readthedocs.io/en/latest/api.html#s3fs.core.S3FileSystem.find does not seem to work.

The API should return a filtered list of file names, but it just returns all file names in the folder.

Thank you in advance.

@martindurant
Member

Please post an example of the call you are making, what you expect and what you get instead. If this is in the form of a test function, even better!

@nkwook
Author

nkwook commented Jan 11, 2023

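    import os

    def get_file_lists(self, prefix):
        """Return the file names in the bucket that start with `prefix`."""
        self.logger.info(prefix)
        result = []
        try:
            # Expectation: only names beginning with `prefix` are returned.
            result = self.s3f.find(os.environ["S3_BUCKET_NAME"], prefix=prefix)
        except Exception as e:
            self.logger.error(e)
        return result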

I was using this function to filter files by prefix. However, when I call it now, find returns every file name in os.environ["S3_BUCKET_NAME"], so I need to add the following code to achieve my goal.

      # temp code since prefix param not working
      # result = [x for x in result if x.startswith(f'{os.environ["S3_BUCKET_NAME"]}/'+prefix)]
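
For reference, the client-side workaround above can be written as a small standalone helper. This is only a sketch of the filtering step, not s3fs code: the function name `filter_by_prefix` and the sample paths are hypothetical, and it assumes the listing contains full paths of the form `bucket/key`.

```python
def filter_by_prefix(paths, root, prefix):
    """Keep only the paths whose name under `root` starts with `prefix`.

    `root` is the bucket (or bucket/folder) that was listed; a trailing
    slash on `root` is tolerated.
    """
    full_prefix = f"{root.rstrip('/')}/{prefix}"
    return [p for p in paths if p.startswith(full_prefix)]


# Hypothetical listing, standing in for the unfiltered result of find().
listing = ["my-bucket/logs/a.txt", "my-bucket/raw/b.csv", "my-bucket/raw2/c.csv"]
print(filter_by_prefix(listing, "my-bucket", "raw"))
# prints ['my-bucket/raw/b.csv', 'my-bucket/raw2/c.csv']
```

Note that this is a plain string-prefix match, so `raw` also matches `raw2/...`; pass `raw/` to restrict the result to a single folder.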

Thank you!

@martindurant
Member

@rlamy, is it perhaps a consequence of the directory iterator that the prefix got lost? Perhaps you have a little time to have a look.

@rlamy
Contributor

rlamy commented Jan 20, 2023

@martindurant I couldn't reproduce the issue. I also don't really see how the _iterdir() change could affect this.

@martindurant
Member

I suggested it because the _iterdir() change was the last to touch the code affecting this functionality.

@nkwook, do you think you can format your code in the style of one of our tests (i.e., no environment variables, and with the files you expect to search through explicitly set up)? It should fail against the mock moto server just the same as against AWS. I am also now trying it and am NOT seeing a problem.

In [10]: fs.find("mymdtemp/zarr", prefix="o")
Out[10]: ['mymdtemp/zarr/oi', 'mymdtemp/zarr/oi.ipynb']

In [11]: len(fs.find("mymdtemp/zarr"))
Out[11]: 1291
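
A self-contained reproduction along those lines might look like the sketch below. It is not taken from the s3fs test suite: the function name `check_find_prefix` and the object names are hypothetical, `fs` is assumed to be any filesystem object offering `pipe()` and `find(..., prefix=...)` (for example an `S3FileSystem` pointed at a moto server), and `root` is assumed to be an initially empty location inside an existing bucket.

```python
def check_find_prefix(fs, root):
    """Write four small objects under `root` and check that `prefix` filters them."""
    # Explicitly create the files we expect to search through.
    for name in ["oi", "oi.ipynb", "data", "notes.txt"]:
        fs.pipe(f"{root}/{name}", b"x")

    everything = fs.find(root)
    filtered = fs.find(root, prefix="o")

    # Without a prefix, all four objects should be listed.
    assert len(everything) == 4, everything
    # With prefix="o", only names under `root` starting with "o" should remain.
    assert sorted(filtered) == [f"{root}/oi", f"{root}/oi.ipynb"], filtered
```

If the prefix were being dropped as reported, the second assertion would fail and its message would show the unfiltered listing, which makes the expected and actual results explicit.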

3 participants