total computational and storage needs for running ModGP #35

Open
evalieungh opened this issue Feb 4, 2025 · 3 comments

Comments


evalieungh commented Feb 4, 2025

Edit: I'm hijacking this issue to ask a more general question about the computational cost of running ModGP.

How much disk space is required to run the whole pipeline? Adding CAPFITOGEN may increase the total somewhat, but I think ModGP and the BioClim data are the heaviest consumers of storage.


What are the total storage requirements for the bioclimatic data download? I'm trying to make a similar download for testing CAPFITOGEN locally, and so far I have two files for 2 m temperature that seem to cover less than a year of data; each file is ~23 GB. I assume there is some further processing that reduces their size and stores them as NetCDF instead, so the end result may take up less space.

So for the current example of ModGP, with data from 1985–2015, it may be too much to store locally. I can run shorter tests, so it's OK so far, but I think we should spell out how storage-hungry this process is. @trossi or @MichalTorma, do you know how much space is used (on LUMI or elsewhere) for these data?

It would be nice to add a warning with a concrete example in the code, e.g. in the ModGP master script where the download happens.
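Such a warning could be sketched roughly as follows. This is a minimal, hypothetical Python sketch (the function name, the target directory, and the ~200 GB figure mentioned later in this thread are all assumptions, not part of ModGP):

```python
import shutil

# Rough size of the full default-span bioclimatic download.
# Placeholder estimate from this thread; adjust once the final figure is known.
ESTIMATED_DOWNLOAD_GB = 200

def check_disk_space(target_dir: str, required_gb: float = ESTIMATED_DOWNLOAD_GB) -> bool:
    """Warn if the filesystem holding `target_dir` has too little free space."""
    free_gb = shutil.disk_usage(target_dir).free / 1024**3
    if free_gb < required_gb:
        print(
            f"WARNING: the full download needs ~{required_gb} GB, "
            f"but only {free_gb:.0f} GB are free under {target_dir}. "
            "Consider a shorter time span or a larger scratch area."
        )
        return False
    return True

# Example: check the current directory before starting the download.
check_disk_space(".")
```

A check like this placed just before the download call would turn a silent disk-full failure into an up-front, actionable message.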


MichalTorma commented Feb 4, 2025

I managed to get up to 200 GB, and I think that's close to the final size. I still had some issues, so I haven't managed to download it all yet (I'll have to look at it once I have time again).

evalieungh commented

OK, so it's prohibitively big for local use on a lot of computers. Can you update this issue once you get the full data set downloaded? Assuming a time span of 1985–2015, it would be nice to add a warning that the 'default' time span takes XXX GB in total, i.e. approx. XX GB per year.
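As a rough back-of-the-envelope estimate, assuming the ~200 GB figure above is close to final and covers the full 31-year 1985–2015 span, the per-year cost works out like this:

```python
# Rough per-year storage estimate for the default ModGP time span.
# The 200 GB total is a provisional figure from this thread, not a measured value.
total_gb = 200
years = 2015 - 1985 + 1  # 31 years, inclusive

gb_per_year = total_gb / years
print(f"~{gb_per_year:.1f} GB per year")  # roughly 6-7 GB per year
```

So a shorter test span of a few years should stay well under 50 GB, which is feasible on most laptops.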

@evalieungh evalieungh changed the title FUN.DownBV data requirements total computational and storage needs for running ModGP Feb 6, 2025
evalieungh commented

Since the project is coming to an end, it would be nice to get an overview of the computational resources and storage required to run ModGP (and CAPFITOGEN, once I get it done...) so that it is easier to apply for new resources.

@evalieungh evalieungh mentioned this issue Feb 21, 2025