@@ -132,6 +132,7 @@ model {
   y ~ normal(theta, sigma);
 }
 ```
+
 The centered and non-centered are two parameterizations of the same statistical
 model, but they have very different practical implications for MCMC. Using the
 __bayesplot__ diagnostic plots, we'll see that, for this data, the NCP is
@@ -157,9 +158,9 @@ posterior_cp <- as.array(fit_cp)
 posterior_ncp <- as.array(fit_ncp)
 ```
 
-For now ignore any warnings about divergent transitions after warmup. We will
-come back to those later in the vignette in the [Diagnostics for the No-U-Turn
-Sampler](#diagnostics-for-the-no-u-turn-sampler) section.
+For now ignore any warnings issued by the sampler. We will come back to them
+later in the [Diagnostics for the No-U-Turn
+Sampler](#diagnostics-for-the-no-u-turn-sampler) section.
 
 
 ### Rhat: potential scale reduction statistic
@@ -181,23 +182,24 @@ First we'll quickly fit one of the models above again, this time intentionally
 using too few MCMC iterations. This should lead to some high $\hat{R}$ values.
 
 ```{r, results='hide'}
-fit_cp_50iter <- sampling(schools_mod_cp, data = schools_dat, chains = 2, iter = 50)
+fit_cp_100iter <- sampling(schools_mod_cp, data = schools_dat,
+                           chains = 2, iter = 100)
 ```
 
-**bayesplot** also provides a generic `rhat` extractor function,
-currently with methods defined for models fit using the **rstan** and
-**rstanarm** packages. But regardless of how you fit your model, all
-**bayesplot** needs is a vector of $\hat{R}$ values.
+**bayesplot** provides a generic `rhat` extractor function, currently with
+methods defined for models fit using the **rstan** and **rstanarm** packages.
+But regardless of how you fit your model, all **bayesplot** needs is a vector of
+$\hat{R}$ values.
 
 ```{r print-rhats}
 library("bayesplot")
-rhats <- rhat(fit_cp_50iter)
+rhats <- rhat(fit_cp_100iter)
 print(rhats)
 ```
 
 We can visualize the $\hat{R}$ values with the `mcmc_rhat` function:
 
-```{r mcmc_rhat}
+```{r mcmc_rhat-1}
 color_scheme_set("brightblue") # see help("color_scheme_set")
 mcmc_rhat(rhats)
 ```
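As an aside to the hunk above: since all **bayesplot** needs is a vector of $\hat{R}$ values, however computed, a minimal split-$\hat{R}$ can be sketched in base R. This is a simplified illustration with a hypothetical helper and simulated chains, not **rstan**'s actual estimator:

```r
# Simplified split-R-hat: split each chain in half, then compare
# between-half variance to within-half variance. Hypothetical helper,
# not bayesplot's or rstan's implementation.
split_rhat <- function(sims) {
  # sims: iterations x chains matrix of posterior draws for one parameter
  n <- floor(nrow(sims) / 2)
  # splitting each chain in half lets within-chain trends inflate R-hat
  halves <- cbind(sims[1:n, , drop = FALSE],
                  sims[(nrow(sims) - n + 1):nrow(sims), , drop = FALSE])
  W <- mean(apply(halves, 2, var))      # mean within-chain variance
  B <- n * var(colMeans(halves))        # between-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled variance estimate
  sqrt(var_plus / W)
}

set.seed(123)
well_mixed <- cbind(rnorm(500), rnorm(500))       # overlapping chains
stuck      <- cbind(rnorm(500, 0), rnorm(500, 5)) # chains in different places
split_rhat(well_mixed)  # close to 1
split_rhat(stuck)       # far above 1.1
```

The output of `split_rhat` could be passed straight to `mcmc_rhat`, which only requires a named numeric vector.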
@@ -206,21 +208,21 @@ In the plot, the points representing the $\hat{R}$ values are colored based on
 whether they are less than $1.05$, between $1.05$ and $1.1$, or greater than
 $1.1$.
 
-We can see the names of the parameters with the concerning $\hat{R}$ values by
-turning on the $y$-axis text using the `yaxis_text` convenience function:
+The $y$-axis text is off by default for this plot because it's only
+possible to see the labels clearly for models with very few parameters. We can
+see the names of the parameters with the concerning $\hat{R}$ values using the
+`yaxis_text` convenience function (which passes arguments like `hjust` to
+`ggplot2::element_text`):
 
 ```{r, mcmc_rhat-2}
-mcmc_rhat(rhats) + yaxis_text()
+mcmc_rhat(rhats) + yaxis_text(hjust = 1)
 ```
 
-The $y$-axis text is off by default for this plot because it's only
-possible to see the labels clearly for models with very few parameters.
-
 If we look at the same model fit using longer Markov chains we should see all
 $\hat{R} < 1.1$, and all points in the plot the same (light) color:
 
-```{r, results='hide'}
-mcmc_rhat(rhat = rhat(fit_cp)) + yaxis_text()
+```{r, mcmc_rhat-3}
+mcmc_rhat(rhat = rhat(fit_cp)) + yaxis_text(hjust = 0)
 ```
 
 We can see the same information shown by `mcmc_rhat` but in histogram form using
@@ -239,12 +241,13 @@ larger the ratio of $n_{eff}$ to $N$ the better.
 The **bayesplot** package provides a generic `neff_ratio` extractor function,
 currently with methods defined for models fit using the **rstan** and
 **rstanarm** packages. But regardless of how you fit your model, all
-**bayesplot** needs is a vector of $n_{eff}/N$ values. The `mcmc_neff` and `mcmc_neff_hist` can then be used to plot the ratios.
+**bayesplot** needs is a vector of $n_{eff}/N$ values. The `mcmc_neff` and
+`mcmc_neff_hist` functions can then be used to plot the ratios.
 
 ```{r print-neff-ratios}
 ratios_cp <- neff_ratio(fit_cp)
 print(ratios_cp)
-mcmc_neff(ratios_cp)
+mcmc_neff(ratios_cp, size = 2)
 ```
 
 In the plot, the points representing the values of $n_{eff}/N$ are colored based
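For intuition about what the hunk above plots, the $n_{eff}/N$ ratio for a single chain can be approximated from its autocorrelations in a few lines of base R. This is a hedged sketch with a hypothetical helper and simulated draws; **rstan**'s actual estimator is more sophisticated:

```r
# Rough n_eff / N for one chain: 1 / (1 + 2 * sum of positive
# autocorrelations). Illustrative only, not rstan's estimator.
ess_ratio <- function(draws) {
  rho <- acf(draws, lag.max = length(draws) - 1, plot = FALSE)$acf[-1]
  cutoff <- which(rho <= 0)[1]          # truncate at first non-positive lag
  if (!is.na(cutoff)) rho <- rho[seq_len(cutoff - 1)]
  1 / (1 + 2 * sum(rho))
}

set.seed(123)
iid_draws <- rnorm(1000)                                     # independent draws
ar_draws  <- as.numeric(arima.sim(list(ar = 0.9), n = 1000)) # sticky chain
ess_ratio(iid_draws)  # near 1
ess_ratio(ar_draws)   # much smaller
```

A vector of such ratios, one per parameter, is exactly the kind of input `mcmc_neff` accepts.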
@@ -261,13 +264,14 @@ much lower autocorrelations compared to draws obtained using other MCMC
 algorithms (e.g., Gibbs).
 
 Even for models fit using **rstan** the parameterization can make a big
-difference. Here are the $n_{eff}/N$ plots for `fit_cp` and `fit_ncp` side by side.
+difference. Here are the $n_{eff}/N$ plots for `fit_cp` and `fit_ncp`
+side by side.
 
-```{r mcmc_neff-compare, fig.width=7}
+```{r mcmc_neff-compare}
 # A function we'll use several times to plot comparisons of the centered
 # parameterization (cp) and the non-centered parameterization (ncp). See
 # help("bayesplot_grid") for details on the bayesplot_grid function used here.
-compare_cp_ncp <- function(cp_plot, ncp_plot, ncol = 1) {
+compare_cp_ncp <- function(cp_plot, ncp_plot, ncol = 2) {
   bayesplot_grid(
     cp_plot, ncp_plot,
     grid_args = list(ncol = ncol),
@@ -278,7 +282,7 @@ compare_cp_ncp <- function(cp_plot, ncp_plot, ncol = 1) {
 
 neff_cp <- neff_ratio(fit_cp, pars = c("theta", "mu", "tau"))
 neff_ncp <- neff_ratio(fit_ncp, pars = c("theta", "mu", "tau"))
-compare_cp_ncp(mcmc_neff(neff_cp), mcmc_neff(neff_ncp))
+compare_cp_ncp(mcmc_neff(neff_cp), mcmc_neff(neff_ncp), ncol = 1)
 ```
 
 Because of the difference in parameterization, the effective sample sizes are
@@ -295,14 +299,13 @@ user-specified number of lags.
 Here we can again see a difference when comparing the two parameterizations of
 the same model. For model 1, $\theta_1$ is the primitive parameter for school 1,
 whereas for the non-centered parameterization in model 2 the primitive parameter
-is $\eta_1$ (and $\theta_1$ is later constructed from $\eta_1$, $\mu$, and
-$\tau$):
+is $\eta_1$ (and $\theta_1$ is later constructed from $\eta_1$, $\mu$,
+and $\tau$):
 
 ```{r mcmc_acf}
 compare_cp_ncp(
   mcmc_acf(posterior_cp, pars = "theta[1]", lags = 10),
-  mcmc_acf(posterior_ncp, pars = "eta[1]", lags = 10),
-  ncol = 2
+  mcmc_acf(posterior_ncp, pars = "eta[1]", lags = 10)
 )
 ```
 
@@ -358,7 +361,7 @@ there is a cluster of many red marks:
 ```{r, mcmc_trace}
 color_scheme_set("mix-brightblue-gray")
 mcmc_trace(posterior_cp, pars = "tau", divergences = np_cp) +
-  xlab("Post-warmup Iteration")
+  xlab("Post-warmup iteration")
 ```
 
 To look deeper at the information conveyed by the divergences we can use the
@@ -436,8 +439,7 @@ case for the non-centered parameterization (right):
 ```{r, mcmc_nuts_energy-3, message=FALSE, fig.width=8}
 compare_cp_ncp(
   mcmc_nuts_energy(np_cp, binwidth = 1/2),
-  mcmc_nuts_energy(np_ncp, binwidth = 1/2),
-  ncol = 2
+  mcmc_nuts_energy(np_ncp, binwidth = 1/2)
 )
 ```
 
@@ -448,10 +450,10 @@ posterior:
 ```{r, mcmc_nuts_energy-4, message=FALSE, fig.width=8}
 np_cp_2 <- nuts_params(fit_cp_2)
 np_ncp_2 <- nuts_params(fit_ncp_2)
+
 compare_cp_ncp(
   mcmc_nuts_energy(np_cp_2),
-  mcmc_nuts_energy(np_ncp_2),
-  ncol = 2
+  mcmc_nuts_energy(np_ncp_2)
 )
 ```
 
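Since the energy hunks above revolve around the E-BFMI statistic that `mcmc_nuts_energy` visualizes, here is a compact base-R sketch of that statistic. The energy vectors are simulated stand-ins, not draws from these models, and `ebfmi` is a hypothetical helper:

```r
# E-BFMI: mean squared change in energy between iterations, relative to
# the marginal variance of the energy. Low values (roughly below 0.2-0.3)
# flag poor exploration of the energy distribution by momentum resampling.
ebfmi <- function(energy) {
  mean(diff(energy)^2) / var(energy)
}

set.seed(123)
healthy_energy <- rnorm(1000)         # energy changes freely between iterations
sticky_energy  <- cumsum(rnorm(1000)) # energy drifts slowly (random walk)
ebfmi(healthy_energy)  # around 2
ebfmi(sticky_energy)   # very small
```

When the overlaid histograms of $\pi_E$ and $\pi_{\Delta E}$ in the energy plot look very different, a computation like this one is what is reporting a low value.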