Implementation of inverse trigamma#415
Conversation
Codecov ReportBase: 93.63% // Head: 93.61% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #415 +/- ##
==========================================
- Coverage 93.63% 93.61% -0.03%
==========================================
Files 12 12
Lines 2767 2789 +22
==========================================
+ Hits 2591 2611 +20
- Misses 176 178 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
ararslan
left a comment
There was a problem hiding this comment.
Thanks for the contribution and apologies for the delayed review! This is looking good overall, just some comments and questions. Also, I believe you'll need this (or something like it) in order for e.g. Float32 input to work:
diff --git a/src/gamma.jl b/src/gamma.jl
index 00fea79..ba7e5d8 100644
--- a/src/gamma.jl
+++ b/src/gamma.jl
@@ -515,7 +515,7 @@ for T in (Float16, Float32)
@eval f64(x::Complex{$T}) = Complex{Float64}(x)
@eval f64(x::$T) = Float64(x)
- for f in (:_digamma, :_trigamma, :_zeta, :_eta, :_invdigamma)
+ for f in (:_digamma, :_trigamma, :_zeta, :_eta, :_invdigamma, :_invtrigamma)
@eval $f(z::ComplexOrReal{$T}) = oftype(z, $f(f64(z)))
end
| invtrigamma(x) | ||
| Compute the inverse [`trigamma`](@ref) function of `x`. | ||
| """ | ||
| invtrigamma(y::Number) = _invtrigamma(float(y)) |
There was a problem hiding this comment.
| invtrigamma(y::Number) = _invtrigamma(float(y)) | |
| invtrigamma(x::Number) = _invtrigamma(float(x)) |
It's kind of breaking my brain that what we're calling x and y are the reverse of how they're used in the paper you linked.
| """ | ||
| invtrigamma(y::Number) = _invtrigamma(float(y)) | ||
|
|
||
| function _invtrigamma(y::Float64) |
There was a problem hiding this comment.
| function _invtrigamma(y::Float64) | |
| function _invtrigamma(x::Float64) |
(See above)
| invtrigamma(x) | ||
| Compute the inverse [`trigamma`](@ref) function of `x`. |
There was a problem hiding this comment.
| invtrigamma(x) | |
| Compute the inverse [`trigamma`](@ref) function of `x`. | |
| invtrigamma(x) | |
| Compute the inverse of the [`trigamma`](@ref) function at the positive real value `x`. | |
| This is the solution `y` to the equation `trigamma(y) = x`. |
The line break is for consistency with other docstrings. I realize the text is modified from that of invdigamma but in this case I think it's worth noting the restriction on the domain of x. The added line is just for some extra clarity.
| if y <= 0 | ||
| throw(DomainError(y, "Only positive `y` supported.")) | ||
| end | ||
|
|
||
| if y > 1e7 | ||
| return inv(sqrt(y)) | ||
| elseif y < 1e-6 | ||
| return inv(y) | ||
| end |
There was a problem hiding this comment.
| if y <= 0 | |
| throw(DomainError(y, "Only positive `y` supported.")) | |
| end | |
| if y > 1e7 | |
| return inv(sqrt(y)) | |
| elseif y < 1e-6 | |
| return inv(y) | |
| end | |
| if x <= 0 | |
| throw(DomainError(x, "`x` must be positive.")) | |
| elseif x > 1e7 | |
| return inv(sqrt(x)) | |
| elseif x < 1e-6 | |
| return inv(x) | |
| end |
This brings the error message more in line with the text used in other DomainError messages in this package, e.g. from the Bessel functions. Condensing the conditional is just for brevity.
| x_old = inv(y) + 0.5 | ||
| x_new = x_old | ||
|
|
||
| # Newton iteration | ||
| δ = Inf | ||
| iteration = 0 | ||
| while δ > 1e-8 && iteration <= 25 | ||
| iteration += 1 | ||
| f_x_old = trigamma(x_old) | ||
| δx = f_x_old*(1-f_x_old/y) / polygamma(2, x_old) | ||
| x_new = x_old + δx | ||
| δ = - δx / x_new | ||
| x_old = x_new | ||
| end | ||
|
|
||
| return x_new |
There was a problem hiding this comment.
| x_old = inv(y) + 0.5 | |
| x_new = x_old | |
| # Newton iteration | |
| δ = Inf | |
| iteration = 0 | |
| while δ > 1e-8 && iteration <= 25 | |
| iteration += 1 | |
| f_x_old = trigamma(x_old) | |
| δx = f_x_old*(1-f_x_old/y) / polygamma(2, x_old) | |
| x_new = x_old + δx | |
| δ = - δx / x_new | |
| x_old = x_new | |
| end | |
| return x_new | |
| # Newton iteration | |
| invx = inv(x) | |
| y_new = y_old = invx + 0.5 | |
| for _ in 1:26 | |
| ψ′ = trigamma(y_old) | |
| δ = ψ′ * (1 - ψ′ * invx) / polygamma(2, y_old) | |
| y_new = y_old + δ | |
| -δ / y_new < 1e-8 && break | |
| y_old = y_new | |
| end | |
| return y_new |
AFAICT this is equivalent and more directly maps to the paper, as it avoids introducing a second step size variable. You can also avoid dividing by the input at every iteration by inverting once then multiplying by the inverse.
I assume the number of iterations was chosen to match invdigamma. Do you know whether the algorithm generally converges within the given number of iterations and under what circumstances it may not? When it doesn't, how inaccurate is the result?
|
|
||
|
|
There was a problem hiding this comment.
Just some excess space
This PR implements the function
invtrigamma, which is the inverse oftrigamma.The implementation follows the Appendix "Inversion of Trigamma Function"
in "Linear Models and Empirical Bayes Methods for Assessing Differential Expression in Microarray Experiments"
by Gordon K. Smyth (2004). (The pdf is available, e.g., here.)
In implementing this I also tried to follow the
invdigammacode.