-
Notifications
You must be signed in to change notification settings - Fork 1
/
bayesian_example_analytical.Rmd
149 lines (100 loc) · 4.88 KB
/
bayesian_example_analytical.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
---
title: "Step by step example for Bayesian inference: analytical solution"
author: "Fabienne Krauer"
date: "2022-09-05"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## The question
Let's assume we have a map of the world, and we want to figure out what proportion p is covered by water.
## The data
Do answer the question, we close our eyes and randomly point at the map and write down whether it is Water (0) or Land (1). We get the following data:
```{r}
data = c(0,0,1,0,1,1,0,1,1,1,0,0,1,0,0,0,1,0,1,0,1,0,0,1,0,0,1,1,0,0,1,0,1,0,0,1,0,1,0,0,0,0)
table(data)
```
## The likelihood
The likelihood function that describes the process of multiple Bernoulli events (0 or 1) is a Binomial likelihood.
The parametrization requires to input the number of trials (=data points) and the number of successes (=n of data points with the value of interest (water)):
```{r}
loglik <- function(data, p) {
trials = length(data)
success = length(data[data==0])
ll = dbinom(success, trials, p)
return(ll)
}
```
Let's visualize the log-likelihood profile for a series of probabilities p = proportion of the globe covered with water:
```{r}
# parameters to check
parametervalues <- seq(0,1,0.001)
# get the probability for all parametervalues
loglik_profile <- loglik(data, parametervalues)
# plot results
plot(parametervalues, loglik_profile, type = "l")
legend("topleft", legend = c("Likelihood", "maximum"), col = c("black", "red"), lwd = 1)
abline(v=parametervalues[which.max(loglik_profile)], col = "red")
```
The MLE is p=25/42=0.58 (red line).
## The prior
Let's first assume we know nothing about how much landmass and watermass there is. We define an uniformative (flat/uniform) prior for the probability p:
# Set up the prior function
```{r}
prior_unif <- function(p){
prior <- dunif(p, min=0, max=1)
return(prior)
}
prior <- prior_unif(parametervalues)
```
## The posterior
Remember that posterior = (likelihood * prior) / marginal likelihood.
The marginal likelihood (=normalisation constant) is the weighted average of the prior times the log-likelihood function over all possible values of theta.
```{r}
marg_loglik <- sum(loglik_profile * prior) / length(parametervalues)
```
In the analytical solution, we evaluate the log-likelihood function at every potential value of the parameter (p), hence the posterior is:
```{r}
posterior <- loglik_profile * prior / marg_loglik
plot(parametervalues, posterior, col = "darkgreen", type = "l")
lines(parametervalues, loglik_profile)
lines(parametervalues, prior, col = "red" )
legend("topleft", c("likelihood", "prior", "posterior"), col = c("black", "red", "green"), lwd = 1 )
```
Since our prior is flat, the posterior has the same shape as the log-likelihood profile. However, the area under the curve of the posterior is now 1 (i.e. it integrates to one), because it is now a proper probability distribution for one parameter.
## Rethinking the prior
But actually we know from some previous geographical measurements ("literature") that the water mass is roughly 71%.
Let's change the prior to a narrow Beta distribution with Beta(75,25) with a mean of 0.75:
```{r}
prior_beta <- function(p, shape1, shape2){
prior <- dbeta(p, shape1 = shape1, shape2 = shape2)
return(prior)
}
prior <- prior_beta(parametervalues, 75, 25)
```
Update the marginal loglikelihood and recalculate the prior:
```{r}
marg_loglik <- sum(loglik_profile * prior) / length(parametervalues)
posterior <- loglik_profile * prior / marg_loglik
plot(parametervalues, posterior, col = "darkgreen", type = "l")
lines(parametervalues, loglik_profile)
lines(parametervalues, prior, col = "red" )
legend("topleft", c("likelihood", "prior", "posterior"), col = c("black", "red", "green"), lwd = 1 )
```
The prior has pulled the posterior away from the log-likelihood because it is a tight prior distribution.
We can repeat it with a wider Beta prior Beta(15,5), which has the same mean, but a larger variance:
```{r}
prior <- prior_beta(parametervalues, 15, 5)
```
```{r}
marg_loglik <- sum(loglik_profile * prior) / length(parametervalues)
posterior <- loglik_profile * prior / marg_loglik
plot(parametervalues, posterior, col = "darkgreen", type = "l")
lines(parametervalues, loglik_profile)
lines(parametervalues, prior, col = "red" )
legend("topleft", c("likelihood", "prior", "posterior"), col = c("black", "red", "green"), lwd = 1 )
```
As you can see, in this case the prior did not have much influence on the data. The take-home message here is that you have to think carefully about the prior distributions, and how they are supported by literature.
## Disclaimer
This example is based on Richard McElreath's "globe tossing" example in "Statistical Rethinking" (3rd edition), see page 45. The code is adapted from [Florian Hartig](https://github.com/florianhartig/LearningBayes/blob/master/CommentedCode/01-Principles/InferenceMethods.md).