About the Fisher matrix of Normal Distribution #353

Open
Yingrui-Z opened this issue Apr 28, 2024 · 2 comments
Yingrui-Z commented Apr 28, 2024

Question 1:

The first partial derivatives of the negative log-likelihood of the normal distribution, l = -log N(x; μ, σ^2), with respect to the parameters μ and σ are

  • ∂l/∂μ = (μ - x)/σ^2
  • ∂l/∂σ = (σ^2 - (μ - x)^2)/σ^3 = (1/σ) * (1 - (μ - x)^2/σ^2)
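
As a quick numerical sanity check of these expressions (a sketch of mine, not part of the issue; it only needs scipy):

import numpy as np
from scipy.stats import norm

x, mu, sigma = 1.5, 0.4, 0.9
nll = lambda m, s: -norm.logpdf(x, loc=m, scale=s)  # l = -log N(x; m, s^2)

eps = 1e-6
num_dmu = (nll(mu + eps, sigma) - nll(mu - eps, sigma)) / (2 * eps)
num_dsigma = (nll(mu, sigma + eps) - nll(mu, sigma - eps)) / (2 * eps)

print(num_dmu, (mu - x) / sigma**2)                            # agree
print(num_dsigma, (1 / sigma) * (1 - (mu - x)**2 / sigma**2))  # agree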

In ngboost, the corresponding implementation is:

def d_score(self, Y):
    D = np.zeros((len(Y), 2))
    D[:, 0] = (self.loc - Y) / self.var
    D[:, 1] = 1 - ((self.loc - Y) ** 2) / self.var
    return D

This raises a question: why is D[:, 1] set to 1 - ((self.loc - Y) ** 2) / self.var rather than 1/sqrt(self.var) * (1 - ((self.loc - Y) ** 2) / self.var), which is what the expression for ∂l/∂σ above would give?

Question 2:

In addition, according to the Wikipedia article on the normal distribution, the Fisher information matrix (in the (μ, σ) parametrization) is:

  • F[0,0] = 1/σ^2
  • F[1,1] = 2/σ^2
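
These entries can be checked by Monte Carlo, since each diagonal entry is the expected squared score (a sketch of mine, assuming the (μ, σ) parametrization):

import numpy as np

mu, sigma = 0.4, 0.9
x = np.random.default_rng(1).normal(mu, sigma, size=1_000_000)
d_mu = (x - mu) / sigma**2                    # d log p / d mu
d_sigma = (x - mu)**2 / sigma**3 - 1 / sigma  # d log p / d sigma
print((d_mu**2).mean(), 1 / sigma**2)         # ~ F[0,0]
print((d_sigma**2).mean(), 2 / sigma**2)      # ~ F[1,1]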

However, in the ngboost implementation, specifically in the NormalLogScore class within normal.py, the code snippet is:

def metric(self):
    FI = np.zeros((self.var.shape[0], 2, 2))
    FI[:, 0, 0] = 1 / self.var
    FI[:, 1, 1] = 2
    return FI

This raises another question: why is FI[:, 1, 1] set to 2 instead of 2 / self.var, as the theoretical formula would suggest?

Could you please clarify this discrepancy? Thank you for your assistance.

avati (Collaborator) commented Apr 29, 2024

Both your questions assume the (μ, σ) parametrization of the distribution (for both the gradients and the Fisher information).

In NGBoost the parametrization is (μ, log σ). If you work out the math with this parametrization, you should see the expressions match the implementation in the code.
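
To make this concrete (a sketch of mine, not from the ngboost source): with η = log σ, the chain rule gives ∂l/∂η = σ * ∂l/∂σ = 1 - (μ - x)^2/σ^2, which is exactly the code's D[:, 1], and the Fisher information with respect to η is E[(1 - z^2)^2] = 2 for z ~ N(0, 1), which is exactly FI[:, 1, 1]:

import numpy as np
from scipy.stats import norm

x, mu, sigma = 1.5, 0.4, 0.9
eta = np.log(sigma)
nll = lambda m, e: -norm.logpdf(x, loc=m, scale=np.exp(e))

eps = 1e-6
num_grad = (nll(mu, eta + eps) - nll(mu, eta - eps)) / (2 * eps)
print(num_grad, 1 - (mu - x)**2 / sigma**2)  # matches the code's D[:, 1]

# Fisher information w.r.t. eta, estimated as E[score^2] by Monte Carlo
z = np.random.default_rng(0).normal(mu, sigma, size=1_000_000)
print(((1 - (mu - z)**2 / sigma**2) ** 2).mean())  # ~ 2, matches FI[:, 1, 1]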

Yingrui-Z (Author) commented

Thank you for your kind explanation!

I am working with a probability density function (pdf) defined as follows:

p(x | a, b, c) = a * b * exp(-a * (x - c)) / (1 + exp(-a * (x - c))) ^ (b + 1)

Here, a, b, and c represent the parameters. Calculating the first-order derivatives and the Fisher Information Matrix for these parameters has proven to be exceptionally complex.

For the normal distribution, the transformation new_σ = log(σ) significantly simplifies the calculations. However, given the complex multiplicative interactions between the parameters in this pdf, finding a similar transformation is a challenge.

Could you offer any insights or suggestions on how to transform this pdf to simplify the formulation of the Fisher Information Matrix?
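
One generic route that sidesteps the algebra (a sketch of my own, not ngboost API; the function names and the Monte Carlo approach are my assumptions) is to reparametrize the positive parameters as log a and log b so that all three parameters are unconstrained, and then estimate the Fisher matrix numerically as the expected outer product of the score, E[∇ log p (∇ log p)^T], with finite-difference gradients in place of hand-derived formulas:

import numpy as np

def logpdf(x, theta):
    # theta = (log a, log b, c); a, b > 0 by construction
    a, b, c = np.exp(theta[0]), np.exp(theta[1]), theta[2]
    u = -a * (x - c)
    # log p = log a + log b - a(x - c) - (b + 1) log(1 + exp(-a(x - c)))
    return np.log(a) + np.log(b) + u - (b + 1) * np.logaddexp(0.0, u)

def sample(n, theta, rng):
    # Inverse-CDF sampling: the CDF of this pdf is (1 + exp(-a(x - c)))**(-b)
    a, b, c = np.exp(theta[0]), np.exp(theta[1]), theta[2]
    q = rng.uniform(size=n)
    return c - np.log(q ** (-1.0 / b) - 1.0) / a

def scores(x, theta, eps=1e-5):
    # Central finite-difference gradient of log p w.r.t. each parameter
    g = np.zeros((3, len(x)))
    for i in range(3):
        d = np.zeros(3)
        d[i] = eps
        g[i] = (logpdf(x, theta + d) - logpdf(x, theta - d)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
theta = np.array([np.log(1.5), np.log(2.0), 0.3])
x = sample(200_000, theta, rng)
g = scores(x, theta)
print(g @ g.T / x.size)  # Monte Carlo estimate of the 3x3 Fisher matrix

If I recall correctly, ngboost's LogScore falls back to a Monte Carlo estimate of this kind when a distribution does not provide a closed-form metric(), so a closed form may not be strictly necessary.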
