Releases: pymc-devs/pymc
v3.6
This is a major new release from 3.5 with many new features and important bugfixes. The highlight is certainly our completely revamped website: https://docs.pymc.io/
Note also that this release will be the last to be compatible with Python 2. Thanks to all contributors!
New features
- Track the model log-likelihood as a sampler stat for NUTS and HMC samplers (accessible as `trace.get_sampler_stats('model_logp')`) (#3134)
- Add Incomplete Beta function `incomplete_beta(a, b, value)`
- Add log CDF functions to continuous distributions: `Beta`, `Cauchy`, `ExGaussian`, `Exponential`, `Flat`, `Gumbel`, `HalfCauchy`, `HalfFlat`, `HalfNormal`, `Laplace`, `Logistic`, `Lognormal`, `Normal`, `Pareto`, `StudentT`, `Triangular`, `Uniform`, `Wald`, `Weibull`.
- Behavior of `sample_posterior_predictive` is now to produce posterior predictive samples, in order, from all values of the `trace`. Previously, by default it would produce 1 chain worth of samples, using a random selection from the `trace` (#3212)
- Show diagnostics for initial energy errors in HMC and NUTS.
- PR #3273 has added the `distributions.distribution._DrawValuesContext` context manager. This is used to store the values already drawn in nested `random` and `draw_values` calls, enabling `draw_values` to draw samples from the joint probability distribution of RVs and not the marginals. Custom distributions that must call `draw_values` several times in their `random` method, or that invoke many calls to other distributions' `random` methods (e.g. mixtures) must do all of these calls under the same `_DrawValuesContext` context manager instance. If they do not, the conditional relations between the distribution's parameters could be broken, and `random` could return values drawn from an incorrect distribution.
- `Rice` distribution is now defined with either the noncentrality parameter or the shape parameter (#3287).
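
The effect of sharing one draw context can be illustrated without PyMC3. The sketch below is a toy model only: `DrawContext`, `draw`, and `draw_pair` are hypothetical names and do not reproduce the library's `_DrawValuesContext` implementation. It shows why repeated lookups of the same parent need a common cache: inside the context the second lookup reuses the stored value, so the dependent value stays consistent with its parent, while outside it each lookup is an independent draw.

```python
import random


class DrawContext:
    """Hypothetical stand-in for a shared draw cache (not PyMC3's class)."""

    _stack = []

    def __enter__(self):
        # Nested contexts reuse the cache of the enclosing one.
        self.cache = DrawContext._stack[-1].cache if DrawContext._stack else {}
        DrawContext._stack.append(self)
        return self

    def __exit__(self, *exc):
        DrawContext._stack.pop()
        return False


def draw(name, sampler):
    """Draw `name` once per active context; later calls reuse the stored value."""
    if not DrawContext._stack:
        return sampler()  # no context: every call is an independent draw
    cache = DrawContext._stack[-1].cache
    if name not in cache:
        cache[name] = sampler()
    return cache[name]


def draw_pair(rng):
    """Return (mu, y) with y = mu + 1, looking up 'mu' twice as nested code might."""
    mu = draw("mu", rng.random)
    y = draw("mu", rng.random) + 1.0  # second lookup of the same parent
    return mu, y


rng = random.Random(0)

with DrawContext():
    mu, y = draw_pair(rng)
joint_ok = y == mu + 1.0        # the relation between mu and y is preserved

mu2, y2 = draw_pair(rng)        # no shared context
marginal_ok = y2 == mu2 + 1.0   # the relation is broken: two different draws
```

In this toy version the cache lives only as long as the outermost `with` block, which mirrors the requirement that all related calls happen under the same context manager instance.
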
Maintenance
- Big rewrite of documentation (#3275)
- Fixed Triangular distribution `c` attribute handling in `random` and updated sample codes for consistency (#3225)
- Refactor SMC and properly compute marginal likelihood (#3124)
- Removed use of deprecated `ymin` keyword in matplotlib's `Axes.set_ylim` (#3279)
- Fix for #3210. Now `distribution.draw_values(params)` will draw the `params` values from their joint probability distribution and not from combinations of their marginals (Refer to PR #3273).
- Removed dependence on pandas-datareader for retrieving Yahoo Finance data in examples (#3262)
- Rewrote `Multinomial._random` method to better handle shape broadcasting (#3271)
- Fixed `Rice` distribution, which inconsistently mixed two parametrizations (#3286).
- `Rice` distribution now accepts multiple parameters and observations and is usable with NUTS (#3289).
- `sample_posterior_predictive` no longer calls `draw_values` to initialize the shape of the ppc trace. This call could lead to `ValueError`s when sampling the ppc from a model with `Flat` or `HalfFlat` prior distributions (Fix issue #3294).
Deprecations
- Renamed `sample_ppc()` and `sample_ppc_w()` to `sample_posterior_predictive()` and `sample_posterior_predictive_w()`, respectively.
v3.5 Final
New features
- Add documentation section on survival analysis and censored data models
- Add `check_test_point` method to `pm.Model`
- Add `Ordered` Transformation and `OrderedLogistic` distribution
- Add `Chain` transformation
- Improve error message `Mass matrix contains zeros on the diagonal. Some derivatives might always be zero` during tuning of `pm.sample`
- Improve error message `NaN occurred in optimization.` during ADVI
- Save and load traces without `pickle` using `pm.save_trace` and `pm.load_trace`
- Add `Kumaraswamy` distribution
- Add `TruncatedNormal` distribution
- Rewrite parallel sampling of multiple chains on py3. This resolves long standing issues when transferring large traces to the main process, avoids pickling issues on UNIX, and allows us to show a progress bar for all chains. If parallel sampling is interrupted, we now return partial results.
- Add `sample_prior_predictive` which allows for efficient sampling from the unconditioned model.
- SMC: remove experimental warning, allow sampling using `sample`, reduce autocorrelation from final trace.
- Add `model_to_graphviz` (which uses the optional dependency `graphviz`) to plot a directed graph of a PyMC3 model using plate notation.
- Add beta-ELBO variational inference as in beta-VAE model (Christopher P. Burgess et al. NIPS, 2017)
- Add `__dir__` to `SingleGroupApproximation` to improve autocompletion in interactive environments
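
As a rough illustration of what an ordered transformation does, the sketch below maps an unconstrained vector to a strictly increasing one and back. The function names `ordered_backward` and `ordered_forward` are hypothetical, and the log-difference parameterization is one common choice, assumed here rather than taken from the PyMC3 source.

```python
import math


def ordered_backward(y):
    """Map an unconstrained list y to a strictly increasing list x."""
    x = [y[0]]
    for step in y[1:]:
        x.append(x[-1] + math.exp(step))  # exp(step) > 0 keeps x increasing
    return x


def ordered_forward(x):
    """Inverse map: first element as is, then logs of successive differences."""
    return [x[0]] + [math.log(b - a) for a, b in zip(x, x[1:])]


unconstrained = [0.5, -1.0, 2.0]
ordered = ordered_backward(unconstrained)
recovered = ordered_forward(ordered)
```

Because every increment is an exponential, any real-valued input yields a valid ordered vector, which is what lets a sampler work on the unconstrained side.
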
Fixes
- Fixed grammar in divergence warning, previously `There were 1 divergences ...` could be raised.
- Fixed `KeyError` raised when only a subset of variables is specified to be recorded in the trace.
- Removed unused `repeat=None` arguments from all `random()` methods in distributions.
- Deprecated the `sigma` argument in `MarginalSparse.marginal_likelihood` in favor of `noise`
- Fixed unexpected behavior in `random`. Now the `random` functionality is more robust and will work better for `sample_prior` when that is implemented.
- Fixed `scale_cost_to_minibatch` behaviour, previously this was not working and always `False`
v3.4.1 Final
There was no 3.4 release due to a naming issue on PyPI.
New features
- Add `logit_p` keyword to `pm.Bernoulli`, so that users can specify the logit of the success probability. This is faster and more stable than using `p=tt.nnet.sigmoid(logit_p)`.
- Add `random` keyword to `pm.DensityDist`, enabling users to pass a custom random method, which in turn makes sampling from a `DensityDist` possible.
possible. - Effective sample size computation is updated. The estimation uses Geyer's initial positive sequence, which no longer truncates the autocorrelation series inaccurately.
pm.diagnostics.effective_n
now can reports N_eff>N. - Added
KroneckerNormal
distribution and a correspondingMarginalKron
Gaussian Process implementation for efficient inference, along with
lower-level functions such ascartesian
andkronecker
products. - Added
Coregion
covariance function. - Add new 'pairplot' function, for plotting scatter or hexbin matrices of sampled parameters.
Optionally it can plot divergences. - Plots of discrete distributions in the docstrings
- Add logitnormal distribution
- Densityplot: add support for discrete variables
- Fix the Binomial likelihood in `.glm.families.Binomial`, with the flexibility of specifying the `n`.
- Add `offset` kwarg to `.glm`.
- Changed the `compare` function to accept a dictionary of model-trace pairs instead of two separate lists of models and traces.
- Add test and support for creating multivariate mixture and mixture of mixtures
- `distribution.draw_values` is now also able to draw values from conditionally dependent RVs, such as autotransformed RVs (Refer to PR #2902).
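
The stability claim for `logit_p` can be checked with plain Python. The sketch below is not PyMC3 code: `softplus`, `bernoulli_logp_from_logit`, and `bernoulli_logp_naive` are hypothetical helpers written for this note. It compares a log-probability computed directly from the logit with one that first passes through a sigmoid, assuming a deliberately extreme logit of 40.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def bernoulli_logp_from_logit(x, logit_p):
    """Log-probability of x in {0, 1} given the logit of the success probability."""
    return -softplus(-logit_p) if x == 1 else -softplus(logit_p)


def bernoulli_logp_naive(x, logit_p):
    """Same quantity via an explicit sigmoid; fails once p rounds to 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-logit_p))
    return math.log(p) if x == 1 else math.log(1.0 - p)


stable = bernoulli_logp_from_logit(0, 40.0)  # close to -40

try:
    naive = bernoulli_logp_naive(0, 40.0)
except ValueError:
    naive = None  # p rounded to exactly 1.0, so log(1 - p) became log(0)
```

Working from the logit avoids the round trip through a probability that double precision cannot distinguish from 1.
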
Fixes
- `VonMises` does not overflow for large values of kappa. i0 and i1 have been removed and we now use log_i0 to compute the logp.
- The bandwidth for KDE plots is computed using a modified version of Scott's rule. The new version uses entropy instead of standard deviation. This works better for multimodal distributions. Functions using KDE plots have a new argument `bw` controlling the bandwidth.
- Fix PyMC3 variable is not replaced if provided in more_replacements (#2890)
- Fix for issue #2900. For many situations, named node-inputs do not have a `random` method, while some intermediate node may have it. This meant that if the named node-input at the leaf of the graph did not have a fixed value, `theano` would try to compile it and fail to find inputs, raising a `theano.gof.fg.MissingInputError`. This was fixed by going through the theano variable's owner inputs graph, trying to get intermediate named-nodes values if the leaves had failed.
- In `distribution.draw_values`, some named nodes could be `theano.tensor.TensorConstant`s or `theano.tensor.sharedvar.SharedVariable`s. Nevertheless, in `distribution._draw_value`, these would be passed to `distribution._compile_theano_function` as if they were `theano.tensor.TensorVariable`s. This could lead to the following exceptions: `TypeError: ('Constants not allowed in param list', ...)` or `TypeError: Cannot use a shared variable (...)`. The fix was to not add `theano.tensor.TensorConstant` or `theano.tensor.sharedvar.SharedVariable` named nodes into the `givens` dict that could be used in `distribution._compile_theano_function`.
- Exponential support changed to include zero values.
Deprecations
- DIC and BPIC calculations have been removed
- `df_summary` has been removed, use `summary` instead
- `njobs` and `nchains` kwargs are deprecated in favor of `cores` and `chains` for `sample`
- `lag` kwarg in `pm.stats.autocorr` and `pm.stats.autocov` is deprecated.
v3.3 Final
New features
- Improve NUTS initialization `advi+adapt_diag_grad` and add `jitter+adapt_diag_grad` (#2643)
- Added `MatrixNormal` class for representing vectors of multivariate normal variables
- Implemented `HalfStudentT` distribution
- New benchmark suite added (see http://pandas.pydata.org/speed/pymc3/)
- Generalized random seed types
- Update loo, new improved algorithm (#2730)
- New CSG (Constant Stochastic Gradient) approximate posterior sampling algorithm (#2544)
- Michael Osthege added support for population-samplers and implemented differential evolution metropolis (`DEMetropolis`). For models with correlated dimensions that can not use gradient-based samplers, the `DEMetropolis` sampler can give higher effective sampling rates. (also see PR#2735)
- Forestplot supports multiple traces (#2736)
- Add new plot, densityplot (#2741)
- DIC and BPIC calculations have been deprecated
- Refactor HMC and implemented new warning system (#2677, #2808)
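
To give a feel for how a population sampler such as differential evolution Metropolis proposes moves, here is a minimal one-dimensional sketch in plain Python. It is not the `DEMetropolis` step method: `de_metropolis` and `normal_logp` are hypothetical names, and the scaling constant, jitter size, chain count, and target density are all assumptions chosen for the example.

```python
import math
import random


def de_metropolis(logp, n_chains=8, n_iter=2000, seed=0):
    """Toy 1-D differential evolution Metropolis over a population of chains."""
    rng = random.Random(seed)
    gamma = 2.38 / math.sqrt(2.0)  # commonly quoted scaling for one dimension
    pop = [rng.gauss(0.0, 1.0) for _ in range(n_chains)]
    logps = [logp(x) for x in pop]
    draws = []
    for _ in range(n_iter):
        for i in range(n_chains):
            # Pick two other distinct chains and propose along their difference.
            a, b = rng.sample([j for j in range(n_chains) if j != i], 2)
            proposal = pop[i] + gamma * (pop[a] - pop[b]) + rng.gauss(0.0, 1e-3)
            lp = logp(proposal)
            # Metropolis accept/reject step (the proposal is symmetric).
            if math.log(1.0 - rng.random()) < lp - logps[i]:
                pop[i], logps[i] = proposal, lp
        draws.append(list(pop))
    return draws


def normal_logp(x, mu=3.0, sigma=1.0):
    """Unnormalized log-density of a normal distribution."""
    return -0.5 * ((x - mu) / sigma) ** 2


draws = de_metropolis(normal_logp)
kept = [x for row in draws[len(draws) // 2:] for x in row]  # drop first half as warm-up
posterior_mean = sum(kept) / len(kept)
```

No gradient of `logp` is needed: the spread of the population itself sets the scale and direction of each proposal.
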
Fixes
- Fixed `compareplot` to use `loo` output.
- Improved `posteriorplot` to scale fonts
- `sample_ppc_w` now broadcasts
- `df_summary` function renamed to `summary`
- Add test for `model.logp_array` and `model.bijection` (#2724)
- Fixed `sample_ppc` and `sample_ppc_w` to iterate all chains (#2633, #2748)
- Add Bayesian R2 score (for GLMs) `stats.r2_score` (#2696) and test (#2729).
- SMC works with transformed variables (#2755)
- Speedup OPVI (#2759)
- Multiple minor fixes and improvements in the docs (#2775, #2786, #2787, #2789, #2790, #2794, #2799, #2809)
Deprecations
- Old (`minibatch-`)`advi` is removed (#2781)
v3.2 Final
- This version includes two major contributions from our Google Summer of Code 2017 students:
  - Maxim Kochurov extended and refactored the variational inference module. This primarily adds two important classes, representing operator variational inference (`OPVI`) objects and `Approximation` objects. These make it easier to extend existing `variational` classes, and to derive inference from `variational` optimizations, respectively. The `variational` module now also includes normalizing flows (`NFVI`).
  - Bill Engels added an extensive new Gaussian processes (`gp`) module. Standard GPs can be specified using either `Latent` or `Marginal` classes, depending on the nature of the underlying function. A Student-T process `TP` has been added. In order to accommodate larger datasets, approximate marginal Gaussian processes (`MarginalSparse`) have been added.
- Documentation has been improved as the result of the project's monthly "docathons".
- An experimental stochastic gradient Fisher scoring (`SGFS`) sampling step method has been added.
- The API for `find_MAP` was enhanced.
- SMC now estimates the marginal likelihood.
- Added `Logistic` and `HalfFlat` distributions to set of continuous distributions.
- Bayesian fraction of missing information (`bfmi`) function added to `stats`.
- Enhancements to `compareplot` added.
- QuadPotential adaptation has been implemented.
- Script added to build and deploy documentation.
- MAP estimates now available for transformed and non-transformed variables.
- The `Constant` variable class has been deprecated, and will be removed in 3.3.
- DIC and BPIC calculations have been sped up.
- Arrays are now accepted as arguments for the `Bound` class.
- `random` method was added to the `Wishart` and `LKJCorr` distributions.
- Progress bars have been added to LOO and WAIC calculations.
- All example notebooks updated to reflect changes in API since 3.1.
- Parts of the test suite have been refactored.
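
For readers unfamiliar with the Bayesian fraction of missing information, the sketch below computes it from a list of sampler energies using the usual estimator: the mean squared change between consecutive energies divided by the variance of the energies. This is a standalone illustration, not the `stats` module's code; the local `bfmi` function is written for this note and the energy values are made up.

```python
def bfmi(energy):
    """Estimate the Bayesian fraction of missing information from an energy trace."""
    n = len(energy)
    mean = sum(energy) / n
    variance = sum((e - mean) ** 2 for e in energy) / n
    # Mean squared change between consecutive energies.
    mean_sq_diff = sum((b - a) ** 2 for a, b in zip(energy, energy[1:])) / (n - 1)
    return mean_sq_diff / variance


example = bfmi([1.0, 2.0, 3.0, 4.0])  # 1.0 / 1.25
```

Low values indicate that the energy changes little between iterations relative to its overall spread, which is usually read as a sign of poor exploration.
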
Fixes
- Fixed sampler stats error in NUTS for non-RAM backends
- Matplotlib is no longer a hard dependency, making it easier to use in settings where installing Matplotlib is problematic. PyMC will only complain if plotting is attempted.
- Several bugs in the Gaussian process covariance were fixed.
- All chains are now used to calculate WAIC and LOO.
- AR(1) log-likelihood function has been fixed.
- Slice sampler fixed to sample from 1D conditionals.
- Several docstring fixes.
v3.1 Final
This is the first major update to PyMC 3 since its initial release. Highlights of this release include:
- Gaussian Process submodule
- Much improved variational inference support that includes:
  - Stein Variational Gradient Descent
  - Minibatch processing
  - Additional optimizers, including ADAM
  - Experimental operator variational inference (OPVI)
  - Full-rank ADVI
- MvNormal supports Cholesky Decomposition now for increased speed and numerical stability.
- NUTS implementation now matches current Stan implementation.
- Higher-order integrators for HMC
- Elliptical slice sampler is now available
- Added `Approximation` class and the ability to convert a sampled trace into an approximation via its `Empirical` subclass.
- Add MvGaussianRandomWalk and MvStudentTRandomWalk distributions.
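
The speed and stability benefit of a Cholesky parameterization comes from never forming or inverting the covariance matrix. The sketch below shows the idea in plain Python; `mvnormal_logp_chol` is a hypothetical helper, not the `MvNormal` implementation, and the 2x2 factor is an arbitrary example.

```python
import math


def mvnormal_logp_chol(x, mu, chol):
    """Log-density of a multivariate normal given a lower-triangular factor L
    of the covariance (cov = L L^T)."""
    k = len(x)
    # Forward substitution: solve L z = x - mu without inverting the covariance.
    z = []
    for i in range(k):
        acc = x[i] - mu[i] - sum(chol[i][j] * z[j] for j in range(i))
        z.append(acc / chol[i][i])
    log_det = 2.0 * sum(math.log(chol[i][i]) for i in range(k))  # log|cov|
    quad = sum(v * v for v in z)  # squared Mahalanobis distance
    return -0.5 * (k * math.log(2.0 * math.pi) + log_det + quad)


L = [[2.0, 0.0], [1.0, 3.0]]  # implies cov = [[4, 2], [2, 10]]
mean = [1.0, -1.0]
logp_at_mean = mvnormal_logp_chol(mean, mean, L)
logp_shifted = mvnormal_logp_chol([3.0, 0.0], mean, L)
```

The log-determinant falls out of the diagonal of the factor, and the quadratic form needs only a triangular solve.
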
v3.0 Final
This is the first major release of PyMC3. A number of major changes since splitting from the PyMC2 project include:
- Added gradient-based MCMC samplers: Hamiltonian MC (`HMC`) and No-U-Turn Sampler (`NUTS`)
- Automatic gradient calculations using Theano
- Convenient generalized linear model specification using Patsy formulae
- Parallel sampling via `multiprocessing`
- New model specification using context managers
- New Automatic Differentiation Variational Inference (`ADVI`) allowing faster sampling than `HMC` for some problems.
- Mini-batch ADVI
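
Gradient-based samplers such as HMC and NUTS rely on a numerical integrator driven by the gradient of the log-density. The sketch below is a generic leapfrog integrator in plain Python, offered only as background: it is not PyMC3's implementation, the gradient is written by hand rather than derived by Theano, and `leapfrog`, `hamiltonian`, the step size, and the standard normal target are assumptions made for the example.

```python
def leapfrog(q, p, grad_u, step_size, n_steps):
    """Integrate Hamiltonian dynamics for a potential with gradient grad_u (unit mass)."""
    p = p - 0.5 * step_size * grad_u(q)      # initial half step for momentum
    for step in range(n_steps):
        q = q + step_size * p                # full step for position
        if step != n_steps - 1:
            p = p - step_size * grad_u(q)    # full momentum step between positions
    p = p - 0.5 * step_size * grad_u(q)      # final half step for momentum
    return q, p


def hamiltonian(q, p):
    """Total energy for a standard normal target: U(q) = q^2 / 2, K(p) = p^2 / 2."""
    return 0.5 * q * q + 0.5 * p * p


q0, p0 = 1.0, 1.0
q1, p1 = leapfrog(q0, p0, lambda q: q, step_size=0.1, n_steps=10)
energy_error = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))

# Reversibility: flip the momentum, integrate again, and the start is recovered.
q_back, p_back = leapfrog(q1, -p1, lambda q: q, step_size=0.1, n_steps=10)
```

The small energy error is what keeps acceptance rates high for distant proposals, and reversibility is what makes the resulting Metropolis correction valid.
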
v3.0 Release Candidate 6
Sixth release candidate of PyMC3 3.0.
v3.0 Release Candidate 5
Fifth release candidate of PyMC3 3.0.
v3.0 Release Candidate 4
Fourth release candidate of PyMC3 3.0.