Skip to content

Commit

Permalink
MixedModels Example in docs (#15)
Browse files Browse the repository at this point in the history
* MixedModels Example in docs

* YAS

* layout tidyup
  • Loading branch information
palday authored Aug 31, 2023
1 parent 5124b7a commit 29731f4
Show file tree
Hide file tree
Showing 5 changed files with 193 additions and 9 deletions.
1 change: 1 addition & 0 deletions docs/Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
DocStringExtensions = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
MixedModels = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316"
RDatasets = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
StatsModels = "3eaba693-59b7-5ba5-a881-562e759f1c8d"
Expand Down
4 changes: 3 additions & 1 deletion docs/make.jl
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@ makedocs(; root=joinpath(dirname(pathof(BoxCox)), "..", "docs"),
sitename="BoxCox",
doctest=true,
strict=true,
pages=["index.md", "api.md"])
pages=["index.md",
"mixed-models.md",
"api.md"])

deploydocs(; repo="github.com/palday/BoxCox.jl", push_preview=true, devbranch="main")
41 changes: 33 additions & 8 deletions docs/src/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,11 +19,19 @@ using BoxCox
using CairoMakie
using Random
x = abs2.(randn(MersenneTwister(42), 1000))
hist(x)
let f = Figure(; resolution=(400, 400))
ax = Axis(f[1,1]; xlabel="x", ylabel="density")
density!(ax, x)
f
end
```

```@example Unconditional
qqnorm(x)
let f = Figure(; resolution=(400, 400))
ax = Axis(f[1,1]; xlabel="theoretical", ylabel="observed")
qqnorm!(ax, x)
f
end
```

We fit the Box-Cox transform.
Expand All @@ -36,13 +44,21 @@ Note that the resulting transform isn't exactly a square root, even though our d
Now that we've fit the transform, we use it like a function to transform the original data.

```@example Unconditional
hist(bc.(x))
let f = Figure(; resolution=(400, 400))
ax = Axis(f[1,1]; xlabel="x", ylabel="density")
density!(ax, bc.(x))
f
end
```

There is also a special method for `qqnorm` provided for objects of type `BoxCoxTransformation`, which shows the QQ plot of the transformation applied to the original data.

```@example Unconditional
qqnorm(bc)
let f = Figure(; resolution=(400, 400))
ax = Axis(f[1,1]; xlabel="theoretical", ylabel="observed")
qqnorm!(ax, bc)
f
end
```

We can also generate a diagnostic plot to see how well other parameter values would have worked for normalizing the data.
Expand All @@ -51,7 +67,7 @@ We can also generate a diagnostic plot to see how well other parameter values wo
boxcoxplot(bc)
```

The vertical line corresponds to the final parameter estimate.
The vertical line corresponds to the final parameter estimate.

If we specify a confidence level, then an additional horizontal line is added, which crosses the likelihood profile at the points corresponding to the edge of the confidence interval.

Expand All @@ -70,7 +86,10 @@ using CairoMakie
using RDatasets: dataset as rdataset
using StatsModels
CairoMakie.activate!(; type="svg")
trees = rdataset("datasets", "trees")
first(trees, 5)
```

For the conditional distribution, we want to fit a linear regression to the transformed response values and then evaluate the profile likelihood of the transformation. If the StatsModels package has been loaded, either directly or indirectly (e.g. via loading GLM.jl or MixedModels.jl), then a formula interface is available. (Otherwise, the model matrix and the response have to specified separately.)
Expand All @@ -84,9 +103,12 @@ We can do all the same diagnostics as previously:
```@example Conditional
let f = Figure()
ax = Axis(f[1, 1]; title="Raw")
hist!(ax, trees.Volume)
density!(ax, trees.Volume)
ax = Axis(f[1, 2]; title="Transformed")
hist!(ax, bc.(trees.Volume))
density!(ax, bc.(trees.Volume))
colsize!(f.layout, 1, Aspect(1, 1.0))
colsize!(f.layout, 2, Aspect(1, 1.0))
resize_to_layout!(f)
f
end
```
Expand All @@ -96,7 +118,10 @@ let f = Figure()
ax = Axis(f[1, 1]; title="Raw", aspect=1)
qqnorm!(ax, trees.Volume; qqline=:fitrobust)
ax = Axis(f[1, 2]; title="Transformed", aspect=1)
qqnorm!(ax, bc)
qqnorm!(ax, bc.(trees.Volume); qqline=:fitrobust)
colsize!(f.layout, 1, Aspect(1, 1.0))
colsize!(f.layout, 2, Aspect(1, 1.0))
resize_to_layout!(f)
f
end
```
Expand Down
145 changes: 145 additions & 0 deletions docs/src/mixed-models.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,145 @@
# MixedModels.jl integration

```@meta
CurrentModule = BoxCox
```

BoxCox.jl supports finding fitting the Box-Cox transformation to a `LinearMixedModel` from MixedModels.jl.
On Julia 1.9 and above, this support is done via a package extension and so the user pays no dependency or precompilation penalty when this functionality is not used.
On Julia 1.6 to 1.8, this functionality is defined unconditionally (thus incurring the dependency and precompilation penalty), but neither the `@formula` macro nor the `MixedModel` type are re-exported, so MixedModels.jl must still be loaded to use this functionality.


Let us examine the classic sleepstudy dataset from MixedModels.jl. First, we load the necessary packages.

```@example Mixed
using BoxCox
using CairoMakie
using MixedModels
using MixedModels: dataset
CairoMakie.activate!(; type="svg")
```

Then we fit the traditional model used reaction time as our dependent variable:

```@example Mixed
model = fit(MixedModel,
@formula(reaction ~ 1 + days + (1 + days | subj)),
dataset(:sleepstudy))
```


## Fitting the Box-Cox transformation

While this model does perform well overall, we can also examine whether the Box-Cox transformation suggests a transformation of the response.

```@example Mixed
bc = fit(BoxCoxTransformation, model)
```

!!! note
For large models, fitting the `BoxCoxTransformation` can take a while because a mixed model must be repeatedly fit after each intermediate transformation.


## Choosing an appropriate transformation

Although we receive a single "best" value (approximately -1.0747) the fitting process, it is worthwhile to look at the profile likelihood plot for the transformation:

```@example Mixed
boxcoxplot(bc; conf_level=0.95)
```

Here we see that -1 is nearly as good. Moreover, ``\text{time}^-1`` has a natural interpretation as *speed*.
In other words, we can model reaction speed instead of reaction time.
Then instead of seeing whether participants take longer to respond with each passing day, we can see whether their speed increases or decreases.
In both cases, we're looking at whether they respond *faster* or *slower* and even the terminology *fast* and *slow* suggests that speed is easily interpretable.

Now, the formal definition of the Box-Cox transformation is:

```math
\begin{cases}
\frac{x^{\lambda} - 1}{\lambda} &\quad \lambda \neq 0 \\
\log x &\quad \lambda = 0
\end{cases}
```

In other words, there is a normalizing denominator that flips the sign when ``\lambda < 0``.
If we use the full Box-Cox formula, then the sign of the effect in our transformed and untransformed model remains the same.
While useful at times, speed has a natural interpretation and so we instead use the power relation, which is the actual key component, without normalization.


## Fitting a model to the transormed response

```@example Mixed
model_bc = fit(MixedModel,
@formula(1 / reaction ~ 1 + days + (1 + days | subj)),
dataset(:sleepstudy))
```

Finally, let's take a look at our the residual diagnostics for our transformed and untransformed models:

## Impact of transformation

```@example Mixed
let f = Figure()
ax = Axis(f[1, 1]; title="Speed", aspect=1)
density!(ax, residuals(model))
ax = Axis(f[1, 2]; title="Reaction Time", aspect=1)
density!(ax, residuals(model_bc))
colsize!(f.layout, 1, Aspect(1, 1.0))
colsize!(f.layout, 2, Aspect(1, 1.0))
resize_to_layout!(f)
f
end
```

### QQ plots

```@example Mixed
let f = Figure()
ax = Axis(f[1, 1]; title="Reaction Time", aspect=1)
qqnorm!(ax, residuals(model); qqline=:fitrobust)
ax = Axis(f[1, 2]; title="Speed", aspect=1)
qqnorm!(ax, residuals(model_bc); qqline=:fitrobust)
colsize!(f.layout, 1, Aspect(1, 1.0))
colsize!(f.layout, 2, Aspect(1, 1.0))
resize_to_layout!(f)
f
end
```

### Fitted vs residual

```@example Mixed
let f = Figure()
ax = Axis(f[1, 1]; title="Reaction Time", aspect=1, xlabel="Fitted", ylabel="Residual")
scatter!(ax, fitted(model), residuals(model))
hlines!(ax, 0; linestyle=:dash, color=:black)
ax = Axis(f[1, 2]; title="Speed", aspect=1, xlabel="Fitted", ylabel="Residual")
scatter!(ax, fitted(model_bc), residuals(model_bc))
hlines!(ax, 0; linestyle=:dash, color=:black)
colsize!(f.layout, 1, Aspect(1, 1.0))
colsize!(f.layout, 2, Aspect(1, 1.0))
resize_to_layout!(f)
f
end
```

### Fitted vs observed

```@example Mixed
let f = Figure()
ax = Axis(f[1, 1]; title="Reaction Time", aspect=1)
scatter!(ax, fitted(model), response(model), xlabel="Fitted", ylabel="Observed")
ablines!(ax, 0, 1; linestyle=:dash, color=:black)
ax = Axis(f[1, 2]; title="Speed", aspect=1, xlabel="Fitted", ylabel="Observed")
scatter!(ax, fitted(model_bc), response(model_bc))
ablines!(ax, 0, 1; linestyle=:dash, color=:black)
colsize!(f.layout, 1, Aspect(1, 1.0))
colsize!(f.layout, 2, Aspect(1, 1.0))
resize_to_layout!(f)
f
end
```

All together, this suggests that using speed instead of reaction time does indeed improve the quality of the model fit, even though the fit was already very good for reaction time.
This example also highlighted the importance of the analyst's discretion: we choose a slightly different transformation than the originally estimated Box-Cox transformation in order to yield a model on a naturally interpretable scale.
11 changes: 11 additions & 0 deletions src/BoxCox.jl
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,17 @@ end
Compute the Box-Cox transformation of x for the parameter value λ.
The Box-Cox transformation is defined as:
```math
\\begin{cases}
\\frac{x^{\\lambda} - 1}{\\lambda} &\\quad \\lambda \\neq 0 \\\\
\\log x &\\quad \\lambda = 0
\\end{cases}
```
for positive ``x``. (If ``x <= 0``, then ``x`` must first be translated to be strictly positive.)
`atol` controls the absolute tolerance for treating λ as zero.
The one argument variant curries and creates a one-argument function of `x` for the given λ.
Expand Down

2 comments on commit 29731f4

@palday
Copy link
Owner Author

@palday palday commented on 29731f4 Aug 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JuliaRegistrator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registration pull request created: JuliaRegistries/General/90541

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.2.0 -m "<description of version>" 29731f4283711a8f6e66adc849aa4f0b3c882736
git push origin v0.2.0

Please sign in to comment.