-
Notifications
You must be signed in to change notification settings - Fork 38
Standardise S3 interface for estimate_secondary and update parameter extraction infrastructure #1180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit implements the S3 design standardisation described in inst/dev/design_s3.md for estimate_secondary(), completing the work started in #1144 for estimate_infections(). - Return structure now matches estimate_infections: fit, args, observations - Added S3 class hierarchy: c("epinowfit", "estimate_secondary", class(ret)) - Updated documentation to reflect new accessor-based interface - Added get_samples.estimate_secondary() to extract posterior samples - Added get_predictions.estimate_secondary() to get model predictions - Added summary.estimate_secondary() to summarise results - Updated plot.estimate_secondary() to use get_predictions() accessor - Updated forecast_secondary() to use new accessor methods - Updated all tests for estimate_secondary() to use new structure - Tests now use accessor methods (get_predictions(), get_samples(), summary()) - All existing functionality preserved through accessor methods This is a breaking change. Code that directly accesses the return structure will need to be updated: - Old: result$predictions -> New: get_predictions(result) - Old: result$posterior -> New: get_samples(result) - Old: result$data -> New: result$args The estimate_secondary function now shares a consistent S3 interface with estimate_infections, following the design specification. Note: Using get_predictions() instead of predict() to avoid confusion with standard R predict() methods which typically take newdata for out-of-sample prediction.
#1142) This commit adds get_predictions() for estimate_infections() to provide a consistent interface across both estimate_infections() and estimate_secondary(). ## Changes ### New S3 Method - Added get_predictions.estimate_infections() - extracts predicted reported cases from the model, merging them with observations ### Documentation Updates - Updated get_predictions() generic documentation to cover both model types - Updated estimate_infections() @return documentation to mention get_predictions() alongside get_samples() and summary() ### Tests - Updated test_estimate_infections() helper to test get_predictions() - Verifies it returns expected columns (date, confirm, mean) ## Consistent Interface Now both estimate_infections() and estimate_secondary() share the same accessor pattern: - get_samples() - raw posterior samples - get_predictions() - predicted observable outcomes - summary() - summarized estimates This provides a unified S3 interface as specified in the design document.
… of raw samples Changed summary.estimate_secondary() to call extract_stan_param() which returns summary statistics (mean, sd, median, credible intervals) for all model parameters, rather than forwarding to get_samples() which returns raw posterior samples. This makes the behavior consistent with summary.estimate_infections() and provides a clearer distinction between the accessor methods: - get_samples() - raw posterior samples (individual draws) - summary() - summary statistics of parameters (mean, sd, CrIs) - get_predictions() - predictions merged with observations
…sistency Modified summary.estimate_secondary() to match the interface of summary.estimate_infections(): - Added 'type' argument with options "snapshot" (default) and "parameters" - Default (type = "snapshot") returns only key parameters (delay_params, params, frac_obs) for a concise summary - type = "parameters" returns all parameters - Added 'params' argument to filter specific parameters when type = "parameters" This provides consistent default behavior across both summary methods: - estimate_infections(): defaults to concise snapshot (~6 lines) - estimate_secondary(): now also defaults to concise snapshot (key params only) Users can get full parameter tables with summary(fit, type = "parameters").
…dary The term 'snapshot' is specific to real-time epidemic monitoring in estimate_infections() (showing R, growth rate, doubling time). For estimate_secondary(), we're showing estimated model parameters (delay distributions and scaling factors), so 'estimates' is more appropriate.
Use 'compact' instead of 'estimates' to better distinguish the two modes: - type = 'compact': Shows only key parameters (default, concise view) - type = 'parameters': Shows all parameters or filtered subset The term 'compact' clearly describes the format difference rather than implying semantic distinction between types of estimates.
Add missing NAMESPACE exports for the new S3 interface: - get_samples.estimate_secondary() - get_predictions generic and methods for both estimate_infections and estimate_secondary - summary.estimate_secondary() These methods were defined but not exported, causing S3 dispatch to fail.
Add the @method roxygen tag so roxygen2 correctly generates S3method(summary,estimate_secondary) in NAMESPACE instead of exporting it as a regular function.
- Break long function signatures across multiple lines - Break long data.table assignment across lines - Combine nested if conditions in summary method
Adjust hanging indent to match linter expectations: - get_predictions.estimate_infections: 49 -> 48 spaces - get_predictions.estimate_secondary: 48 -> 47 spaces
forecast_secondary() returns objects with the old structure ($predictions) rather than the new standardized structure ($fit, $args, $observations). Add check to handle both cases: return $predictions directly for forecast objects, extract from fit for fitted estimate_secondary objects.
forecast_secondary() now returns a 'forecast_secondary' object instead of 'estimate_secondary', matching the pattern used for forecast_infections(). Changes: - Change class from estimate_secondary to forecast_secondary - Add get_samples.forecast_secondary() method - Add get_predictions.forecast_secondary() method - Remove backward compatibility hack from get_predictions.estimate_secondary() - Export new S3 methods in NAMESPACE This properly separates fitted objects (estimate_secondary) from forecast objects (forecast_secondary), each with appropriate accessor methods.
forecast_secondary objects now have their own plot.forecast_secondary() method, matching the pattern for forecast_infections. The implementation is identical to plot.estimate_secondary() since both use get_predictions() which works for both classes.
Allow CrIs to be customized when calling get_samples(), matching the old behavior where CrIs could be specified.
get_samples.estimate_secondary() now returns individual MCMC draws rather than summary statistics, consistent with the method's purpose. Tests updated to calculate summary statistics from the raw samples.
Created extract_array_parameter() to standardise extraction of matrix parameters (delay_params, params) into indexed format (e.g., params[1], params[2]). Now used consistently by both estimate_infections (via format_samples_with_dates) and estimate_secondary (via get_samples.estimate_secondary).
Follows same pattern as plot.forecast_infections which delegates to plot.estimate_infections. Reduces code duplication.
Updated get_samples.estimate_secondary() to extract all parameters including sim_secondary using extract_latent_state(). Refactored summary.estimate_secondary() and get_predictions.estimate_secondary() to use get_samples() and calc_summary_measures() instead of extract_stan_param(), following the same pattern as estimate_infections. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
…ys() Simplified and unified the parameter extraction API: - extract_parameters(): extracts all params with named lookups - extract_delays(): extracts all delays with named lookups (ready for future delay ID system similar to param IDs) Both functions are plural, both always use lookups, both extract everything. This makes the API consistent and prepares for implementing delay_id_* variables in Stan (e.g., delay_id_gt, delay_id_reporting, delay_id_truncation). Removed extract_parameter() and extract_array_parameter() as they're no longer needed. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Update extract_delays() to use the delay ID lookup system similar to extract_parameters(). The function now searches for *_id variables (e.g., delay_id, trunc_id) and uses delay_params_groups to map them to meaningful parameter names.
Update the delay parameter system to use semantic naming similar to the params system. This replaces the generic delay_id, trunc_id, gt_id variables with descriptive delay_id_* variables. Changes: - Rename delay parameters in R code: - gt → generation_time - delay → reporting (in infections context) - delay → secondary (in secondary context) - trunc → truncation - Update create_stan_delays() to create delay_id_* variables with names matching the argument names (e.g., delay_id_generation_time, delay_id_reporting, delay_id_secondary, delay_id_truncation) - Update Stan data files to declare new delay_id_* variables - Update Stan code to use semantic delay names throughout - Update extract_delays() to use the new delay_id_* naming system This creates a more flexible and extensible system that clearly identifies which delay is being used in each context.
- Update check_truncation_length() to use delay_id_truncation - Fix gt_id to delay_id_generation_time in estimation models - Remove duplicate delay_id_generation_time from simulation_delays.stan - Update all test cases to use new semantic delay names
- Create data/infections.stan for delay_id_reporting (infections context) - Create data/simulation_infections.stan for simulate_infections - Remove delay_id_reporting from observation_model.stan (shared file) - This fixes estimate_secondary which doesn't use reporting delays
- Rename delay_id_secondary back to delay_id_reporting - Move delay_id_reporting to shared observation_model.stan - Remove separate infections.stan and simulation_infections.stan files - Update R code and tests to use reporting parameter name - Both infections and secondary models now use same observation model
- Remove unnecessary copy() in plot.estimate_secondary() - Add copy() in get_samples.forecast_* methods to prevent mutation - Add get_predictions.forecast_infections() method - These changes ensure objects are properly protected from mutation
- Implement $.estimate_secondary() to handle old structure access - Maps 'predictions' to get_predictions() - Maps 'posterior' to get_samples() - Maps 'data' to object$observations - All deprecated accessors issue lifecycle warnings - Maintains continuity for existing user code
- Moves delay_id_reporting back to preserve variable ordering - RNG state differs due to renamed variables (delay_id->delay_id_reporting) - Deterministic outputs (infections) unchanged - Stochastic outputs (reported_cases) differ but are valid - This is expected when refactoring Stan data blocks
Update variable references in inst/dev/stan-to-R.R to match the new delay ID naming convention: - gt_id → delay_id_generation_time - delay_id → delay_id_reporting - trunc_id → delay_id_truncation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Rename delay_rev_pmf to reporting_rev_pmf in files that use delay_id_reporting to make it clear these are reporting delays, not other types of delays (like generation time delays). Updated files: - inst/stan/estimate_infections.stan - inst/stan/estimate_secondary.stan - inst/stan/simulate_secondary.stan - inst/dev/stan-to-R.R All lines remain under 80 characters. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Fixed all lines >80 chars that were introduced or modified in this branch: - inst/stan/estimate_infections.stan - inst/stan/estimate_secondary.stan - inst/stan/estimate_truncation.stan - inst/stan/simulate_infections.stan - inst/stan/simulate_secondary.stan 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
In simulate_infections.stan, the code was incorrectly checking and using delay_id_generation_time when convolving infections to reports. This should use delay_id_reporting (the reporting delay), not the generation time delay. This bug caused the simulate_infections snapshot to change because it fundamentally changed the model behavior. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Use switch() instead of if/else in estimate_secondary $ operator - Remove explicit return() statements - Fix line length in checks.R - Add parameter to globalVariables 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis pull request introduces S3 class structures for Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant estimate_secondary
participant Accessors as get_samples<br/>get_predictions<br/>summary
participant Stan as Stan Fit
User->>estimate_secondary: estimate_secondary(...)
estimate_secondary->>Stan: Stan inference
Stan-->>estimate_secondary: posterior samples
estimate_secondary->>estimate_secondary: Assemble fit, args, observations
estimate_secondary-->>User: S3 object c("epinowfit", "estimate_secondary")
User->>Accessors: get_samples(object)
Accessors->>Stan: Extract raw samples
Accessors-->>User: Formatted data.table
User->>Accessors: get_predictions(object)
Accessors->>Stan: Derive from sim_secondary
Accessors-->>User: Predictions with CrIs
User->>Accessors: summary(object, type="compact")
Accessors->>Accessors: Filter to key parameters
Accessors-->>User: Summary statistics
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Areas requiring extra attention:
Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (8)
R/create.R (1)
688-705: Delay ID mapping looks correct; consider guarding for unnamed delaysThe new
delay_id_*construction is consistent withdelay_types_groups(one ID per non-empty delay type, used as an index intodelay_types_groups), and the concatenation intoretmatches the new Stan data fields.One small robustness improvement: if
create_stan_delays()were ever called with unnamed or partially named arguments,delay_names <- names(delays)will yieldNULLor empty strings and you would silently generate unnamed or oddly nameddelay_id_*entries. If you expect all delays to be named (e.g.generation_time,reporting,truncation), you could add a brief check (e.g.stopifnot(!is.null(delay_names), all(nzchar(delay_names)))) to fail fast on misuse. This is optional but would make errors easier to diagnose.Also applies to: 781-782
man/extract_parameters.Rd (1)
1-19: Consider loosening thesamplesdescription beyondrstan::extract()The documentation for
extract_parameters()is clear, but the argument description currently hard-codessamplesas coming fromrstan::extract(). Given the wider package supports multiple Stan backends, you might want to phrase this more generically (e.g. “a named list of arrays in the format returned byrstan::extract()”) so it remains accurate even if other backends feed into this helper.R/format.R (1)
226-314: Consider adding guard forseeding_time = 0as optional defensive programmingThe code currently assumes
args$seeding_time >= 1. Whilst the estimate_infections code path guarantees this viaget_seeding_time()(which enforcesmax(round(seeding_time), 1)), andformat_samples_with_dates()is only called from that path, adding a guard would improve robustness against future code changes:if (isTRUE(args$seeding_time > 0L)) { reported_dates <- dates[-seq_len(args$seeding_time)] } else { reported_dates <- dates }Alternatively, use
seq_len()with the assumption held, as it handles the boundary more safely than1:nsyntax. The rest of the function is well-structured.R/get.R (2)
265-316: Guarding assumptions in get_samples.estimate_secondaryThe construction of samples via
extract_delays(),extract_parameters(), andextract_latent_state("sim_secondary", ...)is well-aligned with the new parameter/delay extraction infrastructure and yields the desired unified samples table.Two minor robustness points you might consider:
burn_inandobject$observations[(burn_in + 1):.N]$date: at the moment this assumesburn_in < nrow(observations)andburn_in >= 0. This is enforced inestimate_secondary(), but if objects can be constructed or modified elsewhere it may be worth asserting these constraints (or falling back to all dates) here as a defensive check.- If
extract_latent_state("sim_secondary", ...)returnsNULLor if the model genuinely has nosim_secondarygenerated quantity,get_samples()will still work (good), but downstreamget_predictions.estimate_secondary()will silently see nosim_secondaryrows and return an essentially empty prediction table. A short, explicit error or warning in that situation might make debugging easier for users supplying custom Stan models.These are not blockers, just potential hardening.
324-417: Copy predictions for forecast_ objects to match get_samples behaviour*The new
get_predictions()methods for the fitted objects look good: they consistently derive summaries fromreported_cases/sim_secondarysamples and merge them withobject$observations, and the column ordering matches the expectations intest-estimate_secondary.R.For the forecast classes, though:
get_predictions.forecast_infections <- function(object, ...) { object$predictions } get_predictions.forecast_secondary <- function(object, ...) { object$predictions }these currently return the underlying
data.tableby reference, whileget_samples.forecast_infections()/get_samples.forecast_secondary()returndata.table::copy(object$samples). That asymmetry means callers can accidentally mutateobject$predictionsvia the result ofget_predictions(), which is surprising given the copying semantics used elsewhere.Consider aligning these with
get_samples.*:get_predictions.forecast_infections <- function(object, ...) { - object$predictions + data.table::copy(object$predictions) } get_predictions.forecast_secondary <- function(object, ...) { - object$predictions + data.table::copy(object$predictions) }This keeps forecast objects safer from accidental in-place modification, at minimal cost.
tests/testthat/test-estimate_secondary.R (1)
212-241: Consider adding explicit checks for forecast_secondary class and accessorsThe
forecast_secondarytests still focus on list structure and plotting:inc_preds <- forecast_secondary(...) expect_equal(names(inc_preds), c("samples", "forecast", "predictions")) expect_error(plot(inc_preds, ...), NA)Given the PR introduces an explicit
"forecast_secondary"S3 class and dedicatedget_samples()/get_predictions()methods, it would be useful to assert these as well, for example:expect_s3_class(inc_preds, "forecast_secondary") expect_true(is.data.frame(get_samples(inc_preds))) expect_true(is.data.frame(get_predictions(inc_preds)))This would tighten coverage of the new public surface and catch regressions in S3 registration.
NEWS.md (1)
18-24: Clarify wording around the “breaking change” for estimate_secondaryThe NEWS entry states:
Breaking change: The previous return structure with
predictions,posterior, anddataelements is no longer supported. Use the accessor methods instead.In the implementation you’ve added a dedicated
$.estimate_secondarymethod that still servesobject$predictions,object$posterior, andobject$data, albeit with deprecation warnings, to provide a smoother migration path.To avoid confusion for users reading NEWS, consider softening the wording to something like:
Breaking change: The previous return structure with
predictions,posterior, anddataelements is deprecated and will be removed in a future release. Use the accessor methods instead (get_predictions(),get_samples(), andobject$observations).Very minor, but it will better reflect the actual behaviour.
If you care about prose style, you could also vary the repeated “Use …” sentence openings in this bullet block, but that’s purely cosmetic.
R/estimate_secondary.R (1)
752-797: Backward-compatible $.estimate_secondary helper is a good migration aidThe custom
$method:
- Intercepts
$predictions,$posterior, and$data,- Emits
lifecycle::deprecate_warn()messages pointing users toget_predictions(),get_samples(), and$observations, and- Falls back to
NextMethod("$")for all other element names,is a sensible way to keep existing user code running while nudging it towards the new accessors.
Two small suggestions:
- It would be worth adding a focused test that exercises each branch (
$predictions,$posterior,$data, and a “normal” field like$fit) to ensure the method is registered correctly and thatNextMethod("$")still exposes all other list elements as expected.- In the deprecation messages, you might consider spelling out the full recommended call (e.g.
"get_predictions(object)") for absolute clarity.Functionally, though, this is a solid compatibility layer.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (44)
CLAUDE.md(1 hunks)NAMESPACE(4 hunks)NEWS.md(1 hunks)R/checks.R(2 hunks)R/create.R(3 hunks)R/estimate_infections.R(2 hunks)R/estimate_secondary.R(9 hunks)R/estimate_truncation.R(1 hunks)R/extract.R(4 hunks)R/format.R(2 hunks)R/get.R(1 hunks)R/simulate_infections.R(1 hunks)R/simulate_secondary.R(1 hunks)R/stanmodels.R(1 hunks)R/summarise.R(1 hunks)R/utilities.R(1 hunks)inst/dev/stan-to-R.R(3 hunks)inst/stan/data/observation_model.stan(1 hunks)inst/stan/data/rt.stan(1 hunks)inst/stan/data/simulation_delays.stan(1 hunks)inst/stan/data/simulation_observation_model.stan(1 hunks)inst/stan/data/simulation_rt.stan(1 hunks)inst/stan/estimate_infections.stan(5 hunks)inst/stan/estimate_secondary.stan(2 hunks)inst/stan/estimate_truncation.stan(4 hunks)inst/stan/simulate_infections.stan(2 hunks)inst/stan/simulate_secondary.stan(2 hunks)man/cash-.estimate_secondary.Rd(1 hunks)man/estimate_infections.Rd(1 hunks)man/estimate_secondary.Rd(1 hunks)man/extract_delays.Rd(1 hunks)man/extract_latent_state.Rd(1 hunks)man/extract_parameter.Rd(0 hunks)man/extract_parameter_samples.Rd(1 hunks)man/extract_parameters.Rd(1 hunks)man/extract_stan_param.Rd(1 hunks)man/get_predictions.Rd(1 hunks)man/get_samples.Rd(1 hunks)man/plot.forecast_secondary.Rd(1 hunks)man/summary.estimate_secondary.Rd(1 hunks)tests/testthat/test-checks.R(5 hunks)tests/testthat/test-delays.R(1 hunks)tests/testthat/test-estimate_infections.R(1 hunks)tests/testthat/test-estimate_secondary.R(8 hunks)
💤 Files with no reviewable changes (1)
- man/extract_parameter.Rd
🧰 Additional context used
🪛 LanguageTool
CLAUDE.md
[style] ~39-~39: Would you like to use the Oxford spelling “Standardize”? The spelling ‘Standardise’ is also correct.
Context: ...does, never reference issues - Good: "Standardise return structure for estimate_secondary...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~40-~40: Would you like to use the Oxford spelling “Standardize”? The spelling ‘Standardise’ is also correct.
Context: ...cture for estimate_secondary" - Bad: "Standardise return structure for estimate_secondary...
(OXFORD_SPELLING_Z_NOT_S)
[uncategorized] ~48-~48: The preposition ‘to’ seems more likely in this position.
Context: ...ents - Follow the pull request template in .github/PULL_REQUEST_TEMPLATE.md - Co...
(AI_HYDRA_LEO_REPLACE_IN_TO)
NEWS.md
[style] ~21-~21: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...intervals merged with observations. - Use summary(object) to get summarised par...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~21-~21: Would you like to use the Oxford spelling “summarized”? The spelling ‘summarised’ is also correct.
Context: ...tions. - Use summary(object) to get summarised parameter estimates. Use `type = "compa...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~21-~21: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... to get summarised parameter estimates. Use type = "compact" for key parameters o...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: build (ubuntu-latest, release, https://packagemanager.rstudio.com/cran/__linux__/noble/latest)
- GitHub Check: ubuntu-latest (release)
- GitHub Check: ubuntu-latest (oldrel-1)
- GitHub Check: ubuntu-latest (devel)
- GitHub Check: windows-latest (release)
- GitHub Check: macOS-latest (release)
- GitHub Check: test-coverage
- GitHub Check: stan-model-benchmark
- GitHub Check: R-CMD-as-cran-check
- GitHub Check: synthetic-validation
- GitHub Check: lint-changed-files
- GitHub Check: pkgdown
🔇 Additional comments (58)
man/extract_latent_state.Rd (1)
5-5: Verify that the title case change aligns with project documentation conventions.The title has been changed from title case ("Extract Samples for a Latent State from a Stan model") to sentence case ("Extract samples for a latent state from a Stan model"). Whilst this is a minor cosmetic change with no functional impact, please confirm that this capitalization shift is:
- Part of an intentional documentation style standardization across the Rd files
- Aligned with the project's Rd documentation conventions
If this change is incidental and not part of a broader documentation style update, consider reverting it to maintain consistency with established Rd documentation practices.
R/utilities.R (1)
509-510: LGTM!The addition of "parameter" to
globalVariables()is appropriate and follows standard R package conventions for preventing CMD check warnings about non-standard evaluation usage, particularly in data.table operations within the new parameter extraction infrastructure.man/extract_parameter_samples.Rd (1)
5-5: LGTM! Documentation style updated.The title casing has been correctly updated to sentence case, aligning with the project's style guidelines.
man/extract_stan_param.Rd (1)
5-5: LGTM! Documentation style updated.The title casing has been correctly updated to sentence case, consistent with the project's style guidelines.
CLAUDE.md (1)
29-51: LGTM! Helpful workflow documentation added.The new sections provide clear guidance on the PR workflow, commit message conventions, and NEWS.md practices. The content aligns well with the project's documented preferences.
R/stanmodels.R (1)
17-30: LGTM! Formatting improvements.The reformatting improves readability without any functional changes.
R/extract.R (2)
60-103: LGTM! Well-structured multi-parameter extraction.The function correctly builds a reverse lookup from
param_id_*variables and handles both named and unnamed parameters with appropriate fallbacks.
116-171: LGTM! Robust delay parameter extraction.The function correctly uses the delay ID lookup system with
delay_types_groupsto extract and name delay parameters. The boundary checks are appropriate for a groups array whereid + 1is accessed.tests/testthat/test-estimate_infections.R (1)
33-38: LGTM! Good test coverage for the new accessor.The test correctly validates that
get_predictions()returns predictions with the expected structure and required columns.R/simulate_secondary.R (1)
73-74: LGTM! Parameter naming standardised.The parameter rename from
delaytoreportingaligns with the broader naming standardisation across the codebase.R/estimate_truncation.R (1)
187-187: LGTM! Parameter naming standardised.The parameter rename from
trunctotruncationaligns with the broader naming standardisation across the codebase, improving consistency.inst/stan/data/rt.stan (1)
8-8: Rename todelay_id_generation_timeis consistent with new delay interfaceThe rename to
delay_id_generation_timewithint<lower = 0>cleanly reflects the generation-time delay identifier and matches the R-sidecreate_stan_delays()output. No issues from a Stan data or indexing perspective.inst/stan/data/simulation_observation_model.stan (1)
6-6: Truncation delay ID rename aligns with new delay semantics
delay_id_truncationwith lower bound 0 is a clear, semantically accurate replacement for the previous truncation ID, and fits the new scheme alongsidedelay_id_generation_timeanddelay_id_reporting.R/estimate_infections.R (2)
76-78: Return-value documentation matches the implemented interfaceThe updated description of the
<estimate_infections>object (fit,args,observations) and the recommendation to usesummary(),get_samples(), andget_predictions()now matches the actual return structure at the bottom of the function and the accessor-based workflow introduced elsewhere in the package.
268-271: Renamed delay arguments correctly wired intocreate_stan_delays()Passing
generation_time = generation_time,reporting = delays, andtruncation = truncationintocreate_stan_delays()is consistent with the newdelay_id_generation_time,delay_id_reporting, anddelay_id_truncationfields produced on the R side and consumed by the Stan data blocks. Thetime_pointsexpression is unchanged and remains appropriate for weighting delay priors.inst/stan/data/simulation_delays.stan (1)
28-28:delay_id_reportingimproves clarity for reporting delaysRenaming the generic delay ID to
delay_id_reportingmakes it explicit that this identifier refers to the reporting-delay component, which is helpful now that generation-time and truncation delays have their own IDs. The type and bounds remain appropriate.man/estimate_infections.Rd (1)
107-111: Updated Rd value section is consistent with the function’s APIThe Rd now correctly describes the
<estimate_infections>object fields and points users towardssummary(),get_samples(), andget_predictions(), matching the R documentation and implementation.inst/stan/data/simulation_rt.stan (1)
8-8: Simulation RT data now usesdelay_id_generation_timeconsistentlyUsing
delay_id_generation_timehere keeps the simulation RT data block aligned with the main RT data (inst/stan/data/rt.stan) and with the R-sidecreate_stan_delays()output. No issues spotted.R/simulate_infections.R (1)
137-141: LGTM! Semantic parameter names improve clarity.The updated parameter names (
generation_time,reporting,truncation) are more descriptive than the previous abbreviated versions and align with the broader refactoring across the codebase.inst/stan/estimate_truncation.stan (2)
17-17: LGTM! Semantic delay ID naming enhances clarity.The introduction of
delay_id_truncationas a named constant improves code readability and aligns with the semantic delay ID system implemented across the codebase.
29-29: Consistent use of delay_id_truncation throughout the model.All references to the truncation delay ID have been updated consistently, including array dimensions, PMF retrieval, and loop bounds.
Also applies to: 41-49, 82-84, 93-93
man/extract_delays.Rd (1)
1-22: LGTM! Clear documentation for internal extraction function.The documentation accurately describes the
extract_delays()function's purpose, usage, and return value. Properly marked as internal with the\keyword{internal}tag.R/checks.R (1)
202-219: LGTM! Truncation checks updated consistently.The update from
trunc_idtodelay_id_truncationis applied consistently across null checks and index calculations, preserving the original logic whilst aligning with the new semantic naming convention.tests/testthat/test-checks.R (2)
210-210: LGTM! Test data updated for new delay identifier.The test data correctly uses
delay_id_truncationinstead oftrunc_id, maintaining test coverage for the renamed identifier.Also applies to: 286-286
231-243: LGTM! Test calls updated with semantic parameter names.The
create_stan_delays()calls now use the new semantic parameter names (generation_time,reporting,truncation), ensuring tests validate the updated API.Also applies to: 264-268
man/get_predictions.Rd (1)
1-45: LGTM! Comprehensive documentation for new accessor method.The documentation clearly describes the
get_predictions()S3 generic and its methods across multiple model classes. The return value, usage, and examples are well documented.inst/stan/simulate_secondary.stan (2)
56-67: LGTM! Reporting delay logic updated with semantic identifiers.The update from
delay_idtodelay_id_reportingand the use ofreporting_rev_pmfas the variable name improve code clarity. The convolution logic remains correct.
81-92: LGTM! Truncation logic updated with semantic identifiers.The update from
trunc_idtodelay_id_truncationand the use oftrunc_rev_cmfas the variable name maintain consistency with the broader refactoring whilst preserving the truncation logic.man/estimate_secondary.Rd (1)
94-98: LGTM! Documentation updated for new S3 interface.The return value documentation now accurately reflects the standardised object structure (
fit,args,observations) and the new accessor methods (summary(),get_samples(),get_predictions()), aligning with the PR objectives.inst/stan/data/observation_model.stan (1)
8-9: LGTM! Semantic delay ID naming improves clarity.The renaming from
trunc_idtodelay_id_truncationanddelay_idtodelay_id_reportingaligns well with the semantic naming convention introduced across the codebase. This enhances readability and makes the purpose of each identifier explicit.tests/testthat/test-delays.R (6)
97-113: LGTM! Comprehensive test coverage for delay ID creation.The test correctly verifies that
create_stan_delaysproduces the expecteddelay_id_*variables with sequential IDs when all delay types are provided. The assertions properly check both the presence of the variables and their values.
115-127: LGTM! Correctly tests secondary model delay ID assignment.The test appropriately verifies that when only a subset of delays is provided (reporting and truncation for secondary models), the IDs are still assigned sequentially starting from 1.
129-139: LGTM! Properly validates behaviour for missing delays.The test correctly verifies that when only generation_time is provided, the corresponding
delay_id_generation_timeis created whilstdelay_id_reportinganddelay_id_truncationare absent from the data structure.
141-160: LGTM! Validates extract_delays behaviour with new naming system.The test effectively verifies that
extract_delayscorrectly processes the newdelay_id_*variables and produces properly structured output with semantic parameter names (e.g.,generation_time[1],generation_time[2]).
162-166: LGTM! Appropriate edge case handling.The test correctly verifies that
extract_delaysreturnsNULLwhendelay_paramsare absent, ensuring graceful handling of incomplete data structures.
168-179: LGTM! Ensures backward compatibility with fallback naming.The test appropriately validates that
extract_delaysgracefully handles cases wheredelay_id_*variables are absent by falling back to indexed naming (delay_params[...]), maintaining backward compatibility.man/plot.forecast_secondary.Rd (1)
1-28: LGTM! Clear documentation for the new plotting method.The documentation for
plot.forecast_secondaryis well-structured and provides appropriate guidance for users. The parameter descriptions are clear, and the lifecycle badge appropriately marks this as stable functionality.man/get_samples.Rd (1)
7-8: LGTM! Consistent extension of accessor methods.The addition of
get_samplesmethods forestimate_secondaryandforecast_secondaryproperly aligns with the existing pattern for infection models, providing a consistent interface across the package.Also applies to: 17-19
inst/dev/stan-to-R.R (3)
120-125: LGTM! Consistent delay ID rename for generation time.The update to use
delay_id_generation_time(previouslygt_id) maintains the existing logic whilst adopting the new semantic naming convention.
149-155: LGTM! Reporting delay ID consistently updated.The rename from
delay_idtodelay_id_reportingcorrectly applies the semantic naming pattern to the reporting delay path inconvolve_to_report.
166-172: LGTM! Truncation delay ID properly renamed.The conditional check and subsequent
get_delay_rev_pmfcall correctly usedelay_id_truncation(previouslytrunc_id), maintaining consistency with the new naming scheme.man/cash-.estimate_secondary.Rd (1)
1-27: LGTM! Clear deprecation guidance for backward compatibility.The documentation effectively communicates the deprecation of the old structure and provides explicit guidance for migrating to the new accessor methods (
get_predictions(),get_samples(), and direct$observationsaccess).R/summarise.R (1)
889-945: LGTM! Well-structured summary method for estimate_secondary.The implementation properly:
- Retrieves samples via
get_samples(object)- Filters to non-time-varying parameters using
is.na(date)- Applies appropriate summary statistics with configurable credible intervals
- Supports both compact and filtered parameter views
The regex filtering on line 938 correctly matches key parameter patterns, providing sensible defaults for the compact view.
man/summary.estimate_secondary.Rd (1)
1-40: LGTM! Comprehensive documentation for summary method.The documentation clearly describes the
summary.estimate_secondarymethod's behaviour, including the distinction between "compact" and "parameters" types, and provides appropriate guidance for the return value structure.R/format.R (1)
194-209: Bulk parameter extraction and naming look consistent with new infrastructureUsing
extract_parameters(samples)once and then looping overunique(all_params$parameter)is a sensible consolidation. Theswitch()mapping of"dispersion"→"reporting_overdispersion"and"frac_obs"→"fraction_observed"should preserve backwards‑compatible semantics while standardising names, and thenrow(result) > 0guard avoids inserting empty components.I would just suggest double‑checking that downstream consumers expect these new list element names (rather than the raw Stan parameter names) everywhere this helper is used, to avoid subtle regressions.
inst/stan/estimate_secondary.stan (1)
59-68: Reporting delay refactor matches the new delay ID schemeThe new
delay_id_reportingpath, withvector[delay_type_max[delay_id_reporting] + 1] reporting_rev_pmfand a matchingget_delay_rev_pmf(..., delay_type_max[delay_id_reporting] + 1, ...)feedingconvolve_to_report(scaled, reporting_rev_pmf, 0), is consistent with the updated delay‑ID design and mirrors the patterns in the infections models.This looks correct and should help avoid the “wrong delay id in observation model” bug you’re fixing.
NAMESPACE (1)
4-22: S3 registrations and exports are aligned with the new interfaceThe added
S3method()entries forget_predictions,get_samples,summary.estimate_secondary,plot.forecast_secondary, and the$method forestimate_secondary, together withexport(get_predictions), look consistent with the PR’s S3 design (mirroringestimate_infections/forecast_infections).Assuming the corresponding generics and methods exist in the roxygen‑annotated R sources, this NAMESPACE wiring should behave as intended and survive re‑generation.
Also applies to: 34-35, 42-43, 91-93
inst/stan/simulate_infections.stan (1)
60-128: Delay ID refactor in simulation model is coherent and dimensionally consistentThe simulation model’s switch to
delay_id_generation_time,delay_id_reporting, anddelay_id_truncationlooks internally consistent:
gt_rev_pmf,reporting_rev_pmf, andtrunc_rev_cmfare all declared withdelay_type_max[delay_id_*] + 1and passed the same size intoget_delay_rev_pmf, avoiding dimension mismatches.- The order of operations (use
gt_rev_pmfingenerate_infections, then convolve withreporting_rev_pmf, optionally truncate withtrunc_rev_cmf, then applyscale_obsifobs_scale) is clear and mirrors the estimation model.I don’t see any obvious correctness issues in this block.
inst/stan/estimate_infections.stan (1)
78-107: Generation‑time, reporting, and truncation delays now consistently use semantic delay IDsThe changes here look internally coherent:
gt_rev_pmfis now keyed offdelay_id_generation_timeand sized asestimate_r * (delay_type_max[delay_id_generation_time] + 1), with the correspondingget_delay_rev_pmf()call usingdelay_type_max[delay_id_generation_time] + 1as itsmax_kargument.- Reporting and truncation both use their respective IDs (
delay_id_reporting,delay_id_truncation) withdelay_type_max[delay_id_*] + 1in both the declaration and theget_delay_rev_pmf()call, then feed the resulting kernels intoconvolve_to_report()andtruncate_obs()exactly as in the simulation model.- In the generated‑quantities block,
gt_rev_pmf_for_growthandsampled_gt_rev_pmfalso consistently usedelay_type_max[delay_id_generation_time] + 1, which should keepcalculate_Rt()andcalculate_growth()aligned with the updated generation‑time specification.Given the subtle size arithmetic, it would be worth double‑checking that
get_delay_rev_pmf()’s return shape for the generation‑time ID still matches theestimate_r * (...)layout expected bygenerate_infections(), but structurally these changes look sound.Also applies to: 139-152, 175-188, 268-283
R/get.R (1)
245-261: get_samples.estimate_infections implementation looks consistent with existing patternsThe method cleanly reuses
extract_samples()plusformat_samples_with_dates()and back-fills missingargsentries; this is in line with the new standardised interface and should give a coherentdate/variable/sample/valuelayout without altering existing semantics. I don’t see issues here.tests/testthat/test-estimate_secondary.R (3)
38-92: Tests correctly exercise new accessors and S3 shape for estimate_secondarySwitching to
inc_posterior <- get_samples(inc)[variable %in% params] prev_posterior <- get_samples(prev)[variable %in% params] predictions <- get_predictions(inc) posterior <- get_samples(inc)and asserting
expect_equal(names(inc), c("fit", "args", "observations")) expect_s3_class(inc, "estimate_secondary") expect_s3_class(inc, "epinowfit") expect_equal( names(predictions), c( "date", "primary", "secondary", "accumulate", "median", "mean", "sd", "lower_90", "lower_50", "lower_20", "upper_20", "upper_50", "upper_90" ) )is a good, direct check that the new API (
fit/args/observations+ accessors) behaves as intended.No issues from a testing perspective here.
156-182: Parameter recovery tests are well-adapted to get_samplesThe updated recovery tests now compute summaries directly from
get_samples():inc_summary <- inc_posterior[, .( mean = mean(value), median = stats::median(value) ), by = variable](and similarly for
prevand the cmdstanr backend), then compare these to the known simulated parameters.This approach cleanly validates the new extraction pipeline (via
extract_parameters()under the hood) under both backends and doesn’t hard-code any assumptions about the internal list structure of the fit. Looks solid.Also applies to: 194-208
258-294: Nice alignment of deprecated preprocessing tests with the new return structureThe tests for
filter_leading_zerosandzero_thresholdnow only check that:
- The output object remains of class
"estimate_secondary", and- It exposes the standardised structure:
expect_named(out, c("fit", "args", "observations")) expect_equal(get_predictions(out)$primary, modified_data$primary)This keeps validation focused on observable behaviour while respecting the new
fit/args/observationspattern. No issues here.R/estimate_secondary.R (5)
59-63: Return-value documentation matches the new S3 structureThe updated
@returndescription:
- Describes an
<estimate_secondary>object withfit,args, andobservations, and- Points users to
summary(),get_samples(), andget_predictions(),is consistent with the actual construction of
retlater in the function and with the expectations in the tests. This is clear and accurate.
235-238: Named arguments to create_stan_delays align with the new delay ID semanticsSwitching the delay setup to:
stan_data <- c(stan_data, create_stan_delays( reporting = delays, truncation = truncation, time_points = stan_data$t ))brings
estimate_secondary()in line with the semanticreporting/truncationinterface used elsewhere and should help avoid the kind of delay-ID mismatch that motivated this PR. This looks correct and keeps the Stan data more self-documenting.
269-277: Standardised epinowfit/estimate_secondary return structure looks goodThe new return block:
ret <- list( fit = fit, args = stan_data, observations = reports ) class(ret) <- c("epinowfit", "estimate_secondary", class(ret))achieves the PR goal of aligning
estimate_secondary()withestimate_infections():
fitcarries the Stan fit for diagnostics,argsexposes the full Stan data list, andobservationscontains the cleaned report data.Tests confirm the expected
names(inc)and S3 classes, so this structure is coherent and forward-compatible with the new accessors.
373-421: plot.estimate_secondary correctly switches to accessor-based predictionsHaving
plot.estimate_secondary()start with:predictions <- get_predictions(x)instead of relying on a
$predictionsfield ensures that:
- The plot always uses the canonical prediction computation, and
- The same code can be reused for
forecast_secondaryobjects via their ownget_predictions()method.The rest of the function remains unchanged and still respects
primary,from/to, and optionalnew_obs. This refactor is clean and keeps plotting aligned with the new accessor-based API.
424-439: plot.forecast_secondary reuse is simple and effectiveThe new method:
plot.forecast_secondary <- function(x, primary = FALSE, from = NULL, to = NULL, new_obs = NULL, ...) { plot.estimate_secondary( x, primary = primary, from = from, to = to, new_obs = new_obs, ... ) }nicely reuses the existing plotting logic while dispatching through the correct
get_predictions()method forforecast_secondaryobjects. This keeps the visual behaviour consistent between fitted and forecast objects without duplicating code.
- Update NEWS.md to clarify backward compatibility is deprecated, not removed - Add data.table::copy() to forecast get_predictions() methods for consistency with get_samples() 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 8069e53 is merged into main: |
|
stan model benchmark failure is expected - this is still struggling with changes in variable names in the stan code. |
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 3293941 is merged into main: |
Description
This PR closes #1142.
This PR completes the S3 standardisation for
estimate_secondary(). Doing so necessitated an update to the infrastructure for extracting delays and parameters.Changes by section
1. S3 interface standardisation (commits 141dcd1..3f35449)
estimate_secondary()return structure to matchestimate_infections()withfit,args, andobservationselementsget_samples(),get_predictions(),summary(), andplot()forecast_secondaryS3 class with dedicated methods$operator for the previous return structure2. Parameter extraction infrastructure (commits 134691c..5271cf1)
extract_parameters()for systematic extraction of all model parametersextract_delays()with delay name lookup systemestimate_secondary()to use modern extraction pattern3. Delay ID naming system (commits 180fe5e..c684a8b)
delay_id_*variables with semantic names (e.g.,delay_id_reporting,delay_id_incubation)Note on scope
Whilst the delay ID naming system (
delay_id_reporting, etc.) may appear to extend beyond S3 standardisation, it is a necessary prerequisite. The S3 interface requiresget_samples()to return interpretable delay parameters. Without semantic delay IDs, delays can only be extracted asdelay_params[1],delay_params[2], etc., making it impossible for users to know which delay is which. Semantic IDs enableextract_delays()to return delays with meaningful names that users can interpret and use.Initial submission checklist
devtools::test()anddevtools::check()).devtools::document()).lintr::lint_package()).After the initial Pull Request