Utilities Pipeline #16

Open
AFg6K7h4fhy2 opened this issue Oct 9, 2024 · 5 comments
Labels: documentation (Improvements or additions to documentation), Medium Priority, scope (Decision-making on the scope of a tool, task, or object.)


@AFg6K7h4fhy2
Collaborator

Do something akin to the following for forecasttools:

@AFg6K7h4fhy2 AFg6K7h4fhy2 added the enhancement Enhancements to existing features or code-base items. label Oct 9, 2024
@AFg6K7h4fhy2 AFg6K7h4fhy2 self-assigned this Oct 9, 2024
@AFg6K7h4fhy2 AFg6K7h4fhy2 added documentation Improvements or additions to documentation and removed enhancement Enhancements to existing features or code-base items. labels Oct 21, 2024
@AFg6K7h4fhy2
Collaborator Author

AFg6K7h4fhy2 commented Oct 22, 2024

NOTE: Click "edited" above to see earlier versions of, or corrections to, the diagram below.

Possible pipeline:

%%{init: {"theme": "neutral", "themeVariables": { "fontFamily": "Iosevka", "fontSize": "25px", "lineColor": "#808b96", "arrowheadColor": "#808b96", "edgeStrokeWidth": "10px", "arrowheadLength": "20px"}}}%%
flowchart TD
    A1[COVID-19 Data _from forecasttools_] --> A4[NumPyro Model]
    A2[Influenza Data _from forecasttools_] --> A4[NumPyro Model]
    A3[External Dataset] --> A4[NumPyro Model]
    A4[NumPyro Model] -->|_arviz.from_numpyro_| A5[Forecast As InferenceData Object wo/ Dates]
    A5[Forecast As InferenceData Object wo/ Dates] -->|_Add Dates To InferenceData_ - done| A6[InferenceData Object w/ Dates]
    A6[InferenceData Object w/ Dates] -->|_Convert To Tidy-Like Dataframe_ - done| A7[Polars Forecast Dataframe w/ Draws]
    A7[Polars Forecast Dataframe w/ Draws] -->|_Convert To Hubverse Formatted Dataframe_ - done| A8[FluSight Submission Dataframe]
    A7[Polars Forecast Dataframe w/ Draws] -->|_Convert To ScoringUtils Formatted Dataframe_ - in progress| A9[ScoringUtils DataFrame]
    A7[Polars Forecast Dataframe w/ Draws] -->|_Save_| A10[Parquet File]
    A8[FluSight Submission Dataframe] -->|_Save_| A11[Parquet File]
    A9[ScoringUtils DataFrame] -->|_Save_| A12[Parquet File]
    A8[FluSight Submission Dataframe] -->|_Convert To ScoringUtils Formatted Dataframe_ - in progress| A9[ScoringUtils DataFrame]
    A12[Parquet File] -->|_Get scores in R_| A13[Forecast Scores]
    A11[Parquet File] -->|_Model Forecast Hypothesis Testing_| A14[Model Comparison Report]

    B1[Pulled Parquet Hubverse Submissions] -->|_Model Forecast Hypothesis Testing_| A14[Model Comparison Report]

    linkStyle default stroke: #808b96
    linkStyle default stroke-width: 2.0px
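
The middle of the diagram (add dates, convert to a tidy dataframe with draws, convert to a Hubverse-formatted dataframe) can be illustrated with a small, library-free sketch. This is not the forecasttools implementation: the function names `add_dates`, `linear_quantile`, and `to_quantile_rows` are hypothetical, plain lists of dicts stand in for a Polars dataframe, and the output columns follow my understanding of the Hubverse model-output layout, which should be checked against the hub's own documentation.

```python
from datetime import date, timedelta


def add_dates(draws, start_date, step_days=7):
    """Flatten draws (one list of values per draw) into tidy rows with dates."""
    rows = []
    for draw_idx, series in enumerate(draws):
        for t, value in enumerate(series):
            rows.append(
                {
                    "draw": draw_idx,
                    "date": start_date + timedelta(days=step_days * t),
                    "value": value,
                }
            )
    return rows


def linear_quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    gap = sorted_values[upper] - sorted_values[lower]
    return sorted_values[lower] + (pos - lower) * gap


def to_quantile_rows(tidy_rows, reference_date, target, location,
                     levels=(0.25, 0.5, 0.75)):
    """Summarise draws per date into quantile rows (assumed Hubverse-like columns)."""
    by_date = {}
    for row in tidy_rows:
        by_date.setdefault(row["date"], []).append(row["value"])
    out = []
    for target_end_date in sorted(by_date):
        values = sorted(by_date[target_end_date])
        for level in levels:
            out.append(
                {
                    "reference_date": reference_date,
                    "target": target,
                    # horizon in whole weeks from the reference date
                    "horizon": (target_end_date - reference_date).days // 7,
                    "target_end_date": target_end_date,
                    "location": location,
                    "output_type": "quantile",
                    "output_type_id": level,
                    "value": linear_quantile(values, level),
                }
            )
    return out


# Three made-up draws over two weekly time points (illustrative values only).
draws = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
tidy = add_dates(draws, start_date=date(2024, 10, 12))
submission = to_quantile_rows(
    tidy,
    reference_date=date(2024, 10, 12),
    target="wk inc flu hosp",  # illustrative target name
    location="US",
)
```

Keeping the tidy rows as the shared intermediate mirrors node A7 in the diagram: both the submission format and a scoring format can be derived from the same table of draws.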

@AFg6K7h4fhy2
Collaborator Author

@dylanhmorris Would appreciate feedback on this (possibly including your mental model of the workflow as another diagram). Also, are the arrows visible with your GitHub appearance settings? They rendered for me on a high-contrast white background but not on another setting.

@AFg6K7h4fhy2
Collaborator Author

I can see how the Convert To ScoringUtils Ready DataFrame step could branch off from an intermediate step within Convert To FluSight Submission.

@AFg6K7h4fhy2
Collaborator Author

AFg6K7h4fhy2 commented Oct 23, 2024

@SamuelBrand1 Would appreciate a check in on this as well, Sam.

@AFg6K7h4fhy2 AFg6K7h4fhy2 added the scope Decision-making on the scope of a tool, task, or object. label Nov 4, 2024
@AFg6K7h4fhy2
Collaborator Author

The author will flesh out this comment during the sprint [November 11, November 22]; what exists here is a placeholder, added so as not to lose any writing.

Both comments #16 (comment) and #16 (comment) still stand unaddressed.

Some thoughts: I believe forecasttools-py can come to facilitate aspects of pre- and post-processing in the Real Time Monitoring (hereafter RTM) branch's pipelines. Presently, the utilities offered by forecasttools-py cover only narrow needs of the Short Term Forecasts team's workflows. These workflows include formatting NumPyro forecast model output into Hubverse's submission format. At present, pyrenew-hew has utilities for formatting parts of az.InferenceData so that they are ready for tidy_draws (and spread_draws) in tidybayes and for use with R's scoringutils. forecasttools can be changed so that the user writes as little post-processing (forecast scoring) code as possible; #36 and #9 exist in this regard.
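
As a rough illustration of the kind of post-processing a user would otherwise write by hand, the sketch below joins forecast draws to observed values to produce one row per sample. It is not code from forecasttools or pyrenew-hew: `to_scoring_rows` is a hypothetical helper, the data are made up, and the column names `sample_id`, `predicted`, and `observed` reflect my understanding of what recent scoringutils versions expect for sample-based forecasts, which should be verified before use.

```python
from datetime import date


def to_scoring_rows(tidy_rows, observations, model_name):
    """Join forecast draws to observed values by date, one row per sample.

    tidy_rows: dicts with "draw", "date", and "value" keys.
    observations: dict mapping a date to the observed value.
    """
    rows = []
    for row in tidy_rows:
        observed = observations.get(row["date"])
        if observed is None:
            continue  # no observation yet for this date, so it cannot be scored
        rows.append(
            {
                "model": model_name,
                "target_end_date": row["date"],
                "sample_id": row["draw"],
                "predicted": row["value"],
                "observed": observed,
            }
        )
    return rows


# Made-up draws and a single observation (illustrative values only).
example_draws = [
    {"draw": 0, "date": date(2024, 10, 12), "value": 10.0},
    {"draw": 1, "date": date(2024, 10, 12), "value": 14.0},
    {"draw": 0, "date": date(2024, 10, 19), "value": 11.0},
]
example_observations = {date(2024, 10, 12): 12.0}
scoring_rows = to_scoring_rows(example_draws, example_observations, "example_model")
```

Rows for dates without an observation are dropped rather than filled, so the table handed to the scoring step contains only forecasts that can actually be scored.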
