
CI tests take 20 mins #231

Open
peterdudfield opened this issue Dec 10, 2024 · 8 comments · May be fixed by #233
Labels
good first issue Good for newcomers

Comments

@peterdudfield
Contributor

Is there a way to speed up the CI tests?
Locally they run a lot quicker for me.
We should investigate what is taking so long in CI.
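One quick way to see where the time goes is pytest's built-in duration report. A sketch below (the demo test file is made up purely to show the output; in CI the flag can simply be added to the existing pytest invocation):

```shell
# Write a throwaway test file with one deliberately slow test, then ask pytest
# to report the slowest test phases. --durations=5 prints the five slowest.
cat > test_slow_demo.py <<'EOF'
import time

def test_fast():
    assert True

def test_slow():
    time.sleep(1.5)  # stands in for a slow network call or file load
    assert True
EOF
python -m pytest --durations=5 test_slow_demo.py
```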

peterdudfield added the good first issue label Dec 10, 2024
@laurenceandrews

Happy to take a look at this as my first issue here! Let me know your thoughts, and any info I'm missing about the current CI setup, as this is my first look at the code.

A full pytest run takes around 3:45 for me locally. It doesn't seem like I have permission to re-run somebody else's GitHub Actions run to test CI speed, so I'm assuming the easiest way to test it is to make my own branch and open a PR?


CI runners suffer from several handicaps (beyond generally limited resources):

1. Cold environment

  • Dependencies, environments, and temp files are re-installed and rebuilt every run (this is probably the main difference maker if it's not something we're already conscious of in the CI setup).

2. Higher network latency

  • CI runners can experience slower external API calls or downloads compared to a fast, stable local network.

3. Restrictions on parallelisation

  • Tests that run in parallel locally (if they do at all) may not be optimally configured to do so in CI.

4. Slower disk I/O

  • File reading/writing and the creation of large datasets/temp files are often slower on CI runners than on local SSDs.

Below are some areas we could look at to help tackle the above:

1. Cold Environment

  • Cache Python dependencies using GitHub Actions cache to avoid reinstalling them every run.
  • Explore caching temporary or intermediate files created during tests.

2. Higher Network Latency

  • Replace API or external service calls with mocks where possible to eliminate network delays.
  • Consider using pre-downloaded static datasets or artifacts in place of live network fetches.

3. Restrictions on Parallelisation

  • Look into pytest-xdist to run tests in parallel within the CI environment.
  • Adjust parallelisation settings in pytest to find the optimal number of workers for CI's limited resources.

4. Slower Disk I/O

  • Reduce reliance on larger temporary datasets or files during tests, or mock their creation.
  • Optimize file operations by using memory-efficient formats or in-memory operations (e.g., StringIO instead of actual files).
  • Profile specific tests that rely heavily on I/O to identify bottlenecks and optimize them.
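To make the first and third remedies concrete, the workflow could look something like the fragment below. This is purely a sketch: I haven't checked the repo's actual workflow file, Python version, or install method, so every value here is an assumption.

```yaml
# Hypothetical workflow fragment: cache pip downloads and run tests in parallel.
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"          # reuses pip's wheel cache between runs
      - run: pip install -e . pytest pytest-xdist
      - run: pytest -n auto     # pytest-xdist: one worker per available core
```

The `cache: "pip"` input on `actions/setup-python` handles the GitHub Actions cache keying automatically, which is simpler than wiring up `actions/cache` by hand.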

@peterdudfield
Contributor Author

Any help would be great

@peterdudfield
Contributor Author

Yeah, if you fork and run GitHub Actions yourself, that's probably the best way to do it.

@laurenceandrews

Sorry, going very slowly on this with Christmas activities going on.

Upon cloning I immediately ran into some failing tests, so I need to sort those out before I can analyse the speed issue.

Any idea if the failing tests are a common/known issue before I look deeper into them?

@peterdudfield
Contributor Author

Which tests are failing? You might need to log in to Hugging Face, but I'm not totally sure.

@laurenceandrews

Thanks, here's what I've done to consistently get to this impasse:

  • Cloned from scratch
  • Created a venv using Python 3.11.0
  • Ran `conda install -c conda-forge pyresample` and `pip install quartz-solar-forecast`
  • Ran `pip install pytest fastapi huggingface_hub`
  • Ran `huggingface-cli login` and entered my write-access token

I got the following failures, which all seem environment-related. Let me know if there are any specific setup steps you can think of that I might have missed.

test_run_forecast (FileNotFoundError):
Problem: The test tries to load model-0.3.0.pkl, but that file is missing from open-source-quartz-solar-forecast/venv/Lib/site-packages/quartz_solar_forecast/models/. I only see model-0.4.0.pkl in that folder within the venv, although both model-0.3.0.pkl and model-0.4.0.pkl are present in root > quartz_solar_forecast > models. Is there anything I should know about setting up a venv correctly with this project?
Quick fix: Manually copying model-0.3.0.pkl into the venv folder works, but I'm not happy with that solution!

test_run_eval and test_get_pv_metadata (EmptyDataError):
Problem: Both tests fail because no columns are found in the metadata.csv file. Again, I'm slightly suspicious that pv.py, which contains get_pv_metadata, sits within the venv, whilst the generated metadata.csv sits in the project's root > data > pv directory.
Potential fix: Venv sync/setup fix or manual download required?

test_get_pv (OSError):
Problem: Similar to the above, the test attempts to load a NetCDF file (data/pv/pv.netcdf) which exists but is empty.
Potential fix: Venv sync/setup fix or manual download required?

test_get_file_path (ValueError):
Problem: The format string used for generating file paths is invalid because it is platform-specific.
Proper fix: I've updated test_file_path.py to use an f-string for platform-independent formatting.

@aryanbhosale
Member

Is there a way to speed up CI tests? Locally they are a lot quicker for me? Investigate what is taking a long time in CI

[image: screenshot of the repository's GitHub Actions workflows]
@peterdudfield which one of these?

@peterdudfield
Copy link
Contributor Author

The pytest.yaml workflow is the slow one in CI. For me the tests are also quick locally.

@aryanbhosale aryanbhosale linked a pull request Jan 3, 2025 that will close this issue