MixedModels Example in docs (#15)

* MixedModels Example in docs * YAS * layout tidyup
palday · Aug 31, 2023 · 29731f4 · 29731f4 · palday · Aug 31, 2023
1 parent 5124b7a
commit 29731f4
Show file tree

Hide file tree

Showing 5 changed files with 193 additions and 9 deletions.
diff --git a/docs/Project.toml b/docs/Project.toml
@@ -2,6 +2,7 @@
 CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
 DocStringExtensions = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+MixedModels = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316"
 RDatasets = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 StatsModels = "3eaba693-59b7-5ba5-a881-562e759f1c8d"

diff --git a/docs/make.jl b/docs/make.jl
@@ -6,6 +6,8 @@ makedocs(; root=joinpath(dirname(pathof(BoxCox)), "..", "docs"),
          sitename="BoxCox",
          doctest=true,
          strict=true,
-         pages=["index.md", "api.md"])
+         pages=["index.md",
+                "mixed-models.md",
+                "api.md"])
 
 deploydocs(; repo="github.com/palday/BoxCox.jl", push_preview=true, devbranch="main")
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -19,11 +19,19 @@ using BoxCox
 using CairoMakie
 using Random
 x = abs2.(randn(MersenneTwister(42), 1000))
-hist(x)
+let f = Figure(; resolution=(400, 400))
+    ax = Axis(f[1,1]; xlabel="x", ylabel="density")
+    density!(ax, x)
+    f
+end
 ```
 
 ```@example Unconditional
-qqnorm(x)
+let f = Figure(; resolution=(400, 400))
+    ax = Axis(f[1,1]; xlabel="theoretical", ylabel="observed")
+    qqnorm!(ax, x)
+    f
+end
 ```
 
 We fit the Box-Cox transform.
@@ -36,13 +44,21 @@ Note that the resulting transform isn't exactly a square root, even though our d
 Now that we've fit the transform, we use it like a function to transform the original data.
 
 ```@example Unconditional
-hist(bc.(x))
+let f = Figure(; resolution=(400, 400))
+    ax = Axis(f[1,1]; xlabel="x", ylabel="density")
+    density!(ax, bc.(x))
+    f
+end
 ```
 
 There is also a special method for `qqnorm` provided for objects of type `BoxCoxTransformation`, which shows the QQ plot of the transformation applied to the original data.
 
 ```@example Unconditional
-qqnorm(bc)
+let f = Figure(; resolution=(400, 400))
+    ax = Axis(f[1,1]; xlabel="theoretical", ylabel="observed")
+    qqnorm!(ax, bc)
+    f
+end
 ```
 
 We can also generate a diagnostic plot to see how well other parameter values would have worked for normalizing the data.
@@ -51,7 +67,7 @@ We can also generate a diagnostic plot to see how well other parameter values wo
 boxcoxplot(bc)
 ```
 
-The vertical line corresponds to the final parameter estimate. 
+The vertical line corresponds to the final parameter estimate.
 
 If we specify a confidence level, then an additional horizontal line is added, which crosses the likelihood profile at the points corresponding to the edge of the confidence interval.
 
@@ -70,7 +86,10 @@ using CairoMakie
 using RDatasets: dataset as rdataset
 using StatsModels
 
+CairoMakie.activate!(; type="svg")
+
 trees = rdataset("datasets", "trees")
+first(trees, 5)
 ```
 
 For the conditional distribution, we want to fit a linear regression to the transformed response values and then evaluate the profile likelihood of the transformation. If the StatsModels package has been loaded, either directly or indirectly (e.g. via loading GLM.jl or MixedModels.jl), then a formula interface is available. (Otherwise, the model matrix and the response have to specified separately.)
@@ -84,9 +103,12 @@ We can do all the same diagnostics as previously:
 ```@example Conditional
 let f = Figure()
     ax = Axis(f[1, 1]; title="Raw")
-    hist!(ax, trees.Volume)
+    density!(ax, trees.Volume)
     ax = Axis(f[1, 2]; title="Transformed")
-    hist!(ax, bc.(trees.Volume))
+    density!(ax, bc.(trees.Volume))
+    colsize!(f.layout, 1, Aspect(1, 1.0))
+    colsize!(f.layout, 2, Aspect(1, 1.0))
+    resize_to_layout!(f)
     f
 end
 ```
@@ -96,7 +118,10 @@ let f = Figure()
     ax = Axis(f[1, 1]; title="Raw", aspect=1)
     qqnorm!(ax, trees.Volume; qqline=:fitrobust)
     ax = Axis(f[1, 2]; title="Transformed", aspect=1)
-    qqnorm!(ax, bc)
+    qqnorm!(ax, bc.(trees.Volume); qqline=:fitrobust)
+    colsize!(f.layout, 1, Aspect(1, 1.0))
+    colsize!(f.layout, 2, Aspect(1, 1.0))
+    resize_to_layout!(f)
     f
 end
 ```

diff --git a/docs/src/mixed-models.md b/docs/src/mixed-models.md
@@ -0,0 +1,145 @@
+# MixedModels.jl integration
+
+```@meta
+CurrentModule = BoxCox
+```
+
+BoxCox.jl supports finding fitting the Box-Cox transformation to a `LinearMixedModel` from MixedModels.jl.
+On Julia 1.9 and above, this support is done via a package extension and so the user pays no dependency or precompilation penalty when this functionality is not used.
+On Julia 1.6 to 1.8, this functionality is defined unconditionally (thus incurring the dependency and precompilation penalty), but neither the `@formula` macro nor the `MixedModel` type are re-exported, so MixedModels.jl must still be loaded to use this functionality.
+
+
+Let us examine the classic sleepstudy dataset from MixedModels.jl. First, we load the necessary packages.
+
+```@example Mixed
+using BoxCox
+using CairoMakie
+using MixedModels
+using MixedModels: dataset
+CairoMakie.activate!(; type="svg")
+```
+
+Then we fit the traditional model used reaction time as our dependent variable:
+
+```@example Mixed
+model = fit(MixedModel,
+            @formula(reaction ~ 1 + days + (1 + days | subj)),
+            dataset(:sleepstudy))
+```
+
+
+## Fitting the Box-Cox transformation
+
+While this model does perform well overall, we can also examine whether the Box-Cox transformation suggests a transformation of the response.
+
+```@example Mixed
+bc = fit(BoxCoxTransformation, model)
+```
+
+!!! note
+    For large models, fitting the `BoxCoxTransformation` can take a while because a mixed model must be repeatedly fit after each intermediate transformation.
+
+
+## Choosing an appropriate transformation
+
+Although we receive a single "best" value (approximately -1.0747)  the fitting process, it is worthwhile to look at the profile likelihood plot for the transformation:
+
+```@example Mixed
+boxcoxplot(bc; conf_level=0.95)
+```
+
+Here we see that -1 is nearly as good. Moreover, ``\text{time}^-1`` has a natural interpretation as *speed*.
+In other words, we can model reaction speed instead of reaction time.
+Then instead of seeing whether participants take longer to respond with each passing day, we can see whether their speed increases or decreases.
+In both cases, we're looking at whether they respond *faster* or *slower* and even the terminology *fast* and *slow* suggests that speed is easily interpretable.
+
+Now, the formal definition of the Box-Cox transformation is:
+
+```math
+\begin{cases}
+\frac{x^{\lambda} - 1}{\lambda} &\quad \lambda \neq 0 \\
+\log x &\quad \lambda = 0
+\end{cases}
+```
+
+In other words, there is a normalizing denominator that flips the sign when ``\lambda < 0``.
+If we use the full Box-Cox formula, then the sign of the effect in our transformed and untransformed model remains the same.
+While useful at times, speed has a natural interpretation and so we instead use the power relation, which is the actual key component, without normalization.
+
+
+## Fitting a model to the transormed response
+
+```@example Mixed
+model_bc = fit(MixedModel,
+               @formula(1 / reaction ~ 1 + days + (1 + days | subj)),
+                dataset(:sleepstudy))
+```
+
+Finally, let's take a look at our the residual diagnostics for our transformed and untransformed models:
+
+## Impact of transformation
+
+```@example Mixed
+let f = Figure()
+    ax = Axis(f[1, 1]; title="Speed", aspect=1)
+    density!(ax, residuals(model))
+    ax = Axis(f[1, 2]; title="Reaction Time", aspect=1)
+    density!(ax, residuals(model_bc))
+    colsize!(f.layout, 1, Aspect(1, 1.0))
+    colsize!(f.layout, 2, Aspect(1, 1.0))
+    resize_to_layout!(f)
+    f
+end
+```
+
+### QQ plots
+
+```@example Mixed
+let f = Figure()
+    ax = Axis(f[1, 1]; title="Reaction Time", aspect=1)
+    qqnorm!(ax, residuals(model); qqline=:fitrobust)
+    ax = Axis(f[1, 2]; title="Speed", aspect=1)
+    qqnorm!(ax, residuals(model_bc); qqline=:fitrobust)
+    colsize!(f.layout, 1, Aspect(1, 1.0))
+    colsize!(f.layout, 2, Aspect(1, 1.0))
+    resize_to_layout!(f)
+    f
+end
+```
+
+### Fitted vs residual
+
+```@example Mixed
+let f = Figure()
+    ax = Axis(f[1, 1]; title="Reaction Time", aspect=1, xlabel="Fitted", ylabel="Residual")
+    scatter!(ax, fitted(model), residuals(model))
+    hlines!(ax, 0; linestyle=:dash, color=:black)
+    ax = Axis(f[1, 2]; title="Speed", aspect=1, xlabel="Fitted", ylabel="Residual")
+    scatter!(ax, fitted(model_bc), residuals(model_bc))
+    hlines!(ax, 0; linestyle=:dash, color=:black)
+    colsize!(f.layout, 1, Aspect(1, 1.0))
+    colsize!(f.layout, 2, Aspect(1, 1.0))
+    resize_to_layout!(f)
+    f
+end
+```
+
+### Fitted vs observed
+
+```@example Mixed
+let f = Figure()
+    ax = Axis(f[1, 1]; title="Reaction Time", aspect=1)
+    scatter!(ax, fitted(model), response(model), xlabel="Fitted", ylabel="Observed")
+    ablines!(ax, 0, 1; linestyle=:dash, color=:black)
+    ax = Axis(f[1, 2]; title="Speed", aspect=1, xlabel="Fitted", ylabel="Observed")
+    scatter!(ax, fitted(model_bc), response(model_bc))
+    ablines!(ax, 0, 1; linestyle=:dash, color=:black)
+    colsize!(f.layout, 1, Aspect(1, 1.0))
+    colsize!(f.layout, 2, Aspect(1, 1.0))
+    resize_to_layout!(f)
+    f
+end
+```
+
+All together, this suggests that using speed instead of reaction time does indeed improve the quality of the model fit, even though the fit was already very good for reaction time.
+This example also highlighted the importance of the analyst's discretion: we choose a slightly different transformation than the originally estimated Box-Cox transformation in order to yield a model on a naturally interpretable scale.
diff --git a/src/BoxCox.jl b/src/BoxCox.jl
@@ -95,6 +95,17 @@ end
 
 Compute the Box-Cox transformation of x for the parameter value λ.
 
+The Box-Cox transformation is defined as:
+
+```math
+\\begin{cases}
+\\frac{x^{\\lambda} - 1}{\\lambda} &\\quad \\lambda \\neq 0 \\\\
+\\log x &\\quad \\lambda = 0
+\\end{cases}
+```
+
+for positive ``x``. (If ``x <= 0``, then ``x`` must first be translated to be strictly positive.)
+
 `atol` controls the absolute tolerance for treating λ as zero.
 
 The one argument variant curries and creates a one-argument function of `x` for the given λ.