-
-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve support (or documentation) for weighted draws #184
Comments
Thanks, I think there's a couple of different issues here.
It is, I think you were using it on an > x = rvar_rng(rnorm, 4, mean = 1:4, sd = 2)
> w = runif(4000)
> x
rvar<4000>[4] mean ± sd:
[1] 1 ± 2 2 ± 2 3 ± 2 4 ± 2
>
> d = draws_rvars(x = x)
> d
# A draws_rvars: 4000 iterations, 1 chains, and 1 variables
$x: rvar<4000>[4] mean ± sd:
[1] 1 ± 2 2 ± 2 3 ± 2 4 ± 2
>
> weight_draws(d, w)
# A draws_rvars: 4000 iterations, 1 chains, and 1 variables
$x: rvar<4000>[4] mean ± sd:
[1] 1 ± 2 2 ± 2 3 ± 2 4 ± 2
# ... hidden reserved variables {'.log_weight'} Perhaps we could have a more useful error message here directing people to put the > dw = weight_draws(d, w)
> dw$.log_weight
rvar<4000>[1] mean ± sd:
[1] -1 ± 1 Quite possibly this is not the best approach. I think we could allow
Thanks for catching this, that is a bug --- the weights column should be carried over but is not.
Yeah, the This raises another question, which is what to do with weighted draws objects in
It that example you are asking for the mean of the entire > summarise_draws(d, mean)
# A tibble: 4 x 2
variable mean
<chr> <dbl>
1 x[1] 1.02
2 x[2] 1.99
3 x[3] 2.97
4 x[4] 3.97
> lapply(d, mean)
$x
[1] 1.016734 1.986222 2.972354 3.970131 |
Thanks for the clarifications. Support for weights would be cool and useful, but as it's also complicated, I'm fine that they are not supported. We could instead add explicit functions and examples of using weights. E.g. loo package has E_loo https://mc-stan.org/loo/reference/E_loo.html. With explicit functions it's possible to make it more clear what is expected to work with weights. |
I looked into this more for For |
Thank you both for discussing this issue. I have no good opinion on whether we shall support weights in With regard to |
Another worry I have is that if posterior would work well with weights, that it could be a huge work to get, e.g., ggdist to work with weights, too. That is, there is a danger of endless work to extend support for weighted draws if one part works too well. I still think support for weighted draws would be cool, but I feel it would be easier to start with having functions that the user needs to explicitly use. The idea of including weights as .log_weights was to make it easier to delay the resampling step as that step loses useful information (including diagnostics on the distribution of the weights), but rvars is changing the expectations of the users a lot. |
Yeah, unless/until we have time, maybe the short term solution is explicit functions paired with some warnings or messages when people use summarise_draws or the rvar summary functions that summaries do not account for weights? |
As part of priorsense I've implemented weighted functions to match
If it's decided to directly support summaries of weighted draws, the |
That's an interesting solution @n-kall. @paul-buerkner would a default approach for weighted draws be to pass through a weights argument? Do most of the diagnostic functions have |
That could be an option, yes. Not sure yet though if it would be a good idea to add |
Since @mjskay has implemented weighted rvars (#331), I have been revisited the idea of passing Initial testing seems to indicate that this works without breaking anything, so I am hopeful that this could be a viable way to properly support weights (it would also allow for addressing #341). |
Cool! The core is this commit? Looks pretty straightforward. I wonder about what we want the long term plan to be: do we want folks to migrate to the version that uses
Personally, long-term I'd rather folks not have to manually set an option to get summaries that take weights into account, so some solution that gets us there would be ideal. |
Yea, that's the main change. I think it's fairly unlikely that users have been using weights and then unweighted summaries expecting weights to be ignored, so I think (at least) auto detecting should be fine. Although I didn't check what could happen if an rvar is passed to a custom summary function that expects an array |
Ok, so if one uses a summary function that does not support summarise_draws(x, stats::sd, .use_rvars = TRUE)
# Error in `as.double()`:
# ! Can't convert `x` <rvar<100,4>> to <double>. |
What about three values to
|
I like the idea. Having 3 options might be a little confusing for users but if the default is sensible most of the time, the TRUE and FALSE options would only be needed for special cases / backward compatibility. We should be mindful of how this could scale with many variables and several summary functions, particularly catching errors and potential warnings/messages. I can try to spin something up early next week and see how it feels. |
Cool thanks!
Yeah good point. Ideally people should not be using code with the warnings in place, but actually fixing it so the warnings aren't there.... of course we all know that may be wishful thinking.... |
?weight_draws
saysAdd weights to draws objects, with one weight per draw, for use in subsequent weighting operations.
Currently the only supported subsequent operation isresample_draws()
, but the documentation doesn't make it clear. Ideallyrvars
support would be extended to support weighted draws, but if that is infeasible, the documentation should be clarified.Example:
weight_draws()
is not defined for draws_rvars:There is no error if weighted draws_df is converted to draws_rvars:
The default print method gives the unweighted result
and mean fails
The text was updated successfully, but these errors were encountered: