| title | date |
|---|---|
| Statistical Inference | 2023-02-22 |
Inference is about predicting an answer given an observation.
The goal of MLE (maximum likelihood estimation) is to find the optimal way to fit a distribution to the data, i.e. the distribution parameters that maximise the probability of observing our data.
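As a minimal sketch (with made-up sample values), fitting a Gaussian by MLE has a closed form: the sample mean and the mean squared deviation are exactly the parameters that maximise the likelihood of the observed data:

```python
import math

# Hypothetical sample, purely for illustration.
data = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0]
n = len(data)

mu_hat = sum(data) / n                                  # MLE for the mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n   # MLE variance (divides by n, not n-1)

def log_likelihood(mu, sigma2):
    # Log-probability of the whole sample under a Gaussian(mu, sigma2).
    return sum(-0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)
               for x in data)

# Any other parameter choice scores the data as less likely:
assert log_likelihood(mu_hat, sigma2_hat) >= log_likelihood(mu_hat + 0.5, sigma2_hat)
assert log_likelihood(mu_hat, sigma2_hat) >= log_likelihood(mu_hat, sigma2_hat * 2)
```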
Take the dimensions of the data as independent of each other, so that the joint probability of the feature set factorises into the product of the per-feature probabilities. Naive Bayes is thus "naive" because it disregards any dependence between features, even though they may in fact be correlated with each other.
Linear Regression uses the intuition of minimising the squared error from the data to the model to find the best-fit line. Here we can see that maximising the likelihood leads us to the same conclusion:
Finding the most likely distribution parameter, given the data.
- From the equation, we can see that MAP means maximising the product of the likelihood and the prior probability (some known information of the distribution).
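A small illustration of the likelihood-times-prior idea, assuming a made-up coin-flip experiment with a Beta(2, 2) prior (the conjugate prior for a Bernoulli likelihood, so the MAP estimate is the posterior mode):

```python
# Hypothetical data: 7 heads in 10 flips; Beta(2, 2) prior on the heads probability.
heads, flips = 7, 10
a, b = 2, 2  # prior pseudo-counts, assumed purely for illustration

# MLE: maximise the likelihood alone.
theta_mle = heads / flips

# MAP: maximise likelihood * prior; for a Beta prior this is the
# posterior mode (heads + a - 1) / (flips + a + b - 2).
theta_map = (heads + a - 1) / (flips + a + b - 2)

# The prior pulls the estimate from 0.7 toward 0.5.
```

With no prior information (a uniform prior), the prior term is constant and MAP reduces to MLE.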
With our parameters known, we can make classifications on new data. Will I play orienteering given the forecast? i.e. yes/no given that the new forecast is rainy.
- ML: classify based on the highest likelihood
- MAP: classify based on the highest posterior probability
Take each feature as iid — will I go play orienteering given $x$?: $$ \begin{align} x &= (\text{sunny},\ \text{cool},\ \text{high},\ \text{true}) \\ P(y=\text{yes}\mid x) &= \frac{P(x\mid y=\text{yes})\,P(y=\text{yes})}{P(x)} \\ y_{MAP} &= \operatorname{argmax}_{y\in Y} P(x\mid y)\,P(y) \\ P(x\mid \text{yes})\,P(\text{yes}) &= 0.005 \\ P(x\mid \text{no})\,P(\text{no}) &= 0.021 \\ y_{MAP} &= \text{no} \end{align} $$
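The two scores above can be reproduced with the conditional probabilities of the classic play-tennis table; the counts below are assumed from that standard dataset (9 "yes" days, 5 "no" days), since the table itself is not shown here:

```python
# Assumed per-class counts from the classic play-tennis dataset.
priors = {"yes": 9 / 14, "no": 5 / 14}
likelihoods = {
    "yes": {"sunny": 2 / 9, "cool": 3 / 9, "high": 3 / 9, "true": 3 / 9},
    "no":  {"sunny": 3 / 5, "cool": 1 / 5, "high": 4 / 5, "true": 3 / 5},
}

x = ["sunny", "cool", "high", "true"]  # the new forecast

def score(y):
    # Naive Bayes: P(x|y) P(y) = P(y) * product of per-feature likelihoods.
    s = priors[y]
    for feature in x:
        s *= likelihoods[y][feature]
    return s

y_map = max(priors, key=score)  # "no", since 0.021 > 0.005
```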
ML and MAP produce point estimates for the parameters, rather than a full distribution over them.