Handle large datasets efficiently #582

Open
dalonsoa opened this issue Oct 9, 2024 · 2 comments

Comments

@dalonsoa
Collaborator

dalonsoa commented Oct 9, 2024

  • Some models will require input data at a much higher temporal resolution than the wider model update tick, for example sub-daily or daily inputs to the Abiotic model.
  • The input files for this use case can be very large: not something we want to ingest into the Data object at model startup and hold in RAM.
  • So where do we store this kind of data, and is there a way to load it lazily as required? dask might be well suited to this, as it handles lazy loading of chunked data.
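
As a rough illustration of the lazy-loading idea, here is a minimal sketch using `dask.array` over a memory-mapped NumPy file. Everything specific in it is an assumption made for the example, not part of the project: the file name, the hourly resolution, the grid shape, and the helper names `write_example_file`, `open_lazily` and `load_tick` are all hypothetical, and the real input files may well be in another format (e.g. NetCDF) that needs a different opener.

```python
import tempfile
from pathlib import Path

import dask.array as da
import numpy as np

# Hypothetical sizes: 30 days of hourly data on a 10 x 10 grid.
HOURS_PER_DAY = 24
N_DAYS = 30
GRID_SHAPE = (10, 10)


def write_example_file(path: Path) -> None:
    """Write a small stand-in for a large high-resolution input file."""
    rng = np.random.default_rng(seed=0)
    data = rng.normal(size=(N_DAYS * HOURS_PER_DAY, *GRID_SHAPE))
    np.save(path, data)


def open_lazily(path: Path) -> da.Array:
    """Wrap the file in a dask array chunked by day, without reading it into RAM."""
    # mmap_mode="r" maps the file rather than loading it; bytes are read on access.
    mapped = np.load(path, mmap_mode="r")
    return da.from_array(mapped, chunks=(HOURS_PER_DAY, *GRID_SHAPE))


def load_tick(lazy: da.Array, first_day: int, n_days: int) -> np.ndarray:
    """Materialise only the hours that fall inside one model update tick."""
    start = first_day * HOURS_PER_DAY
    stop = (first_day + n_days) * HOURS_PER_DAY
    return lazy[start:stop].compute()


# Build the example file in a temporary directory (assumed location).
tmp_dir = tempfile.TemporaryDirectory()
example_path = Path(tmp_dir.name) / "hourly_temperature.npy"
write_example_file(example_path)

# Opening is cheap; only the requested week is pulled into memory.
lazy_data = open_lazily(example_path)
week_one = load_tick(lazy_data, first_day=0, n_days=7)
```

The point of chunking by day is that a model asking for one update tick only touches the chunks overlapping that tick; whether daily chunks are the right granularity would depend on the real files and access pattern.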
@dalonsoa dalonsoa added the enhancement New feature or request label Oct 9, 2024
@dalonsoa
Collaborator Author

dalonsoa commented Oct 9, 2024

@vgro, we will need an example simulation with at least one BIG file and some indication of where it is used, so we can explore how best to handle it memory-wise.

@alexdewar alexdewar self-assigned this Oct 9, 2024
@alexdewar
Collaborator

@vgro Do you happen to have a big file like this lying around? No pressure -- I've got lots to be getting on with elsewhere -- but I won't be able to start on this until there's some data for me to work with, so if you do have a chance to look at it over the next few weeks, that'd be great.
