(DRAFT) Post-processing for H models #287

Draft: wants to merge 42 commits into base: main

damonbayer (Collaborator) commented Jan 8, 2025

Unfortunately, this has become a bit of a monster PR, but it will make the project considerably more consistent, clear, and adaptable.

Done

  • Improves the format of the data files: confusing names like "Disease" and "Other" are replaced with more descriptive names like "observed_ed_visits" and "other_ed_visits", in a tidy format.
  • Removes generation of legacy-formatted data files.
  • Simplifies and improves the output of timeseries models so that it more closely matches the output of PyRenew models.
  • Introduces group_time_index_to_date for robust conversion from PyRenew time indices to dates.
  • Introduces parse_pyrenew_model_name to extract the expected features (h, e, w) from the model's name.
  • Generalizes post-processing to work with models featuring any combination of h and e.
  • Changes the offset argument in model scoring to 1, rather than 1 / max_visits, since max_visits is less well-defined when working with multiple targets.
  • Writes just a single scored.rds per model_run_dir. Different models and resolutions are indicated by their respective columns in these tables.
  • Fixes a bug in collecting eval data for hospital admissions (d10dcfb).
  • Simplifies post-processing to remove redundant data generation.
  • Updates all parts of the pipeline to work with the newly formatted data files.
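
For reviewers who want a concrete picture of the two new helpers, here is a rough Python sketch of the ideas behind parse_pyrenew_model_name and group_time_index_to_date. It is not the code in this PR. The function names, the "pyrenew_" prefix followed by component letters, and the rule that index 0 maps to a known first date with a fixed number of days per step are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumed naming scheme (hypothetical): "pyrenew_" followed by any of h, e, w.
KNOWN_COMPONENTS = ("h", "e", "w")


def parse_model_name(model_name: str) -> dict[str, bool]:
    """Return which components (h, e, w) a model name advertises."""
    prefix, sep, suffix = model_name.partition("_")
    if prefix != "pyrenew" or not sep or not suffix:
        raise ValueError(f"Unexpected model name: {model_name!r}")
    unknown = set(suffix) - set(KNOWN_COMPONENTS)
    if unknown:
        raise ValueError(f"Unknown component(s) {sorted(unknown)} in {model_name!r}")
    return {component: component in suffix for component in KNOWN_COMPONENTS}


def time_index_to_date(time_index: int, first_date: date, days_per_step: int = 1) -> date:
    """Map a zero-based model time index to a calendar date.

    Assumes index 0 corresponds to first_date and every step spans
    days_per_step days (1 for a daily series, 7 for a weekly one).
    """
    return first_date + timedelta(days=time_index * days_per_step)


# Example usage with made-up inputs.
features = parse_model_name("pyrenew_he")
third_week = time_index_to_date(2, date(2025, 1, 4), days_per_step=7)
```

Keeping the step length as an explicit argument means a daily series and a weekly series can share one conversion routine, which is the kind of robustness the date helper is meant to provide.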
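
The reasoning behind the offset change can be illustrated with a small sketch. It assumes the offset is the constant added to counts before a log transform during scoring, which this description does not spell out. The log_shift helper and the example counts are hypothetical.

```python
import math


def log_shift(values, offset=1.0):
    """Apply log(x + offset) elementwise; the offset keeps zero counts finite."""
    return [math.log(v + offset) for v in values]


# Two hypothetical targets on very different scales.
ed_visits = [0, 12, 250]
hospital_admissions = [0, 1, 9]

# A fixed offset of 1 treats both targets the same way: a zero count maps to 0.0.
print(log_shift(ed_visits))
print(log_shift(hospital_admissions))

# A data-dependent offset such as 1 / max(values) differs per target,
# so the same zero count is transformed differently for each one.
print(log_shift(ed_visits, offset=1 / max(ed_visits)))
print(log_shift(hospital_admissions, offset=1 / max(hospital_admissions)))
```

With several targets there is no single obvious maximum to divide by, so a constant offset avoids having to pick one.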

In progress

  • Hubverse tables are not yet implemented. They will be implemented as a single large hubverse table that can be filtered later.

To do

  • Plot collation does not work yet.

Out of Scope

Closes

Closes #308
Closes #296

codecov bot commented Jan 8, 2025

Codecov Report

Attention: Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.

Project coverage is 13.48%. Comparing base (82cda4b) to head (0a62d12).

Files with missing lines    Patch %    Lines
pipelines/prep_data.py      0.00%      4 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (82cda4b) and HEAD (0a62d12).

HEAD has 1 upload less than BASE

Flag    BASE (82cda4b)    HEAD (0a62d12)
hewr    1                 0

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #287       +/-   ##
===========================================
- Coverage   24.45%   13.48%   -10.98%     
===========================================
  Files          22       16        -6     
  Lines        1611     1105      -506     
===========================================
- Hits          394      149      -245     
+ Misses       1217      956      -261     
Flag           Coverage Δ
hewr           ?
pipelines      0.00% <0.00%> (ø)
pyrenew_hew    30.97% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.
