Update requirements so S3 is an optional package to reduce package bloat. #1086

Open
JGSweets opened this issue Jan 22, 2024 · 1 comment
Labels
New Feature A feature addition not currently in the library

Comments

@JGSweets
Contributor

Is your feature request related to a problem? Please describe.
Currently, boto3 is installed as a default dependency of DataProfiler.
I suggest making it an optional dependency so that it is installed only when S3 support is needed.

The same approach might be beneficial for things like Parquet support as well, given the package size.
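If boto3 becomes optional, the code paths that use S3 would need to tolerate it being absent. A minimal sketch of one way to do that, assuming a hypothetical helper named `require_optional` (not an existing DataProfiler function) and an extra named `s3`:

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency, or raise an error with an install hint.

    Hypothetical helper for illustration; the name and the error message
    are assumptions, not part of the DataProfiler API.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature. "
            f"Install it with: pip install 'dataprofiler[{extra}]'"
        ) from exc
```

S3-specific code could then call `boto3 = require_optional("boto3", "s3")` at the point of use, so importing the library itself never requires boto3.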

Describe the outcome you'd like:

pip install dataprofiler  # does not install boto3 or its dependencies

pip install 'dataprofiler[s3]'  # installs boto3 and its dependencies
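On the packaging side, the two install commands above map onto setuptools' `install_requires` and `extras_require` arguments. A rough sketch of how the split might be declared; the core dependency list is a placeholder and `build_setup_kwargs` is a hypothetical helper, not the project's actual `setup.py`:

```python
# Placeholder core dependencies for illustration; not the project's real list.
CORE_REQUIREMENTS = ["numpy", "pandas"]

# Optional dependency groups, keyed by the extra name used in
# `pip install 'dataprofiler[s3]'`.
EXTRAS_REQUIRE = {
    "s3": ["boto3"],
}


def build_setup_kwargs() -> dict:
    """Collect the keyword arguments that would be passed to setuptools.setup()."""
    return {
        "name": "dataprofiler",
        "install_requires": CORE_REQUIREMENTS,
        "extras_require": EXTRAS_REQUIRE,
    }


# In a real setup.py this would end with:
#     from setuptools import setup
#     setup(**build_setup_kwargs())
```

With this layout, a plain install resolves only `install_requires`, while requesting the `s3` extra adds boto3 and whatever boto3 itself depends on.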

Additional context:
This can reduce the size of Docker images or Lambda jobs that require DataProfiler.

@JGSweets JGSweets added the New Feature A feature addition not currently in the library label Jan 22, 2024
@taylorfturner
Contributor

Thanks @JGSweets!

5 participants