Closed
Description
Memory leak while using ADVI with np.ma.array for observed values
A minimal, self-contained, and reproducible example.
import numpy as np
import pymc3 as pm
# generate dataset
x_obs = np.random.normal(loc=37, scale=1, size=100000)
x_obs = np.ma.array(x_obs, mask=x_obs>37)
# define model
with pm.Model() as model:
    z = pm.Normal('z', mu=0, sd=10)
    x = pm.Normal('x', mu=z, sd=1, observed=x_obs)
# fit model
with model:
    approx = pm.fit(n=10000, method='advi', obj_optimizer=pm.adam(learning_rate=0.1), obj_n_mc=2)
Warnings while defining the model
/usr/local/miniconda3/envs/doh/lib/python3.7/site-packages/pymc3/model.py:1266: UserWarning: Data in x contains missing values and will be automatically imputed from the sampling distribution.
  warnings.warn(impute_message, UserWarning)
Additional information
The memory leak does not occur if either of the following holds:
- obj_n_mc=1
- np.array is used instead of np.ma.array for observed values.
Memory leak while using ADVI without np.ma.array for observed values
A minimal, self-contained, and reproducible example.
import numpy as np
import pymc3 as pm
import theano
# generate dataset
x_obs = np.random.normal(loc=37, scale=1, size=100000)
dim_num = 1000
idx_list = np.random.choice(dim_num, size=20000)
# define model
with pm.Model() as model:
    z = pm.Normal('z', mu=0, sd=1, shape=dim_num)
    x = pm.Normal('x', mu=theano.tensor.sum(z[idx_list]), sd=1, observed=x_obs)
# fit model
with model:
    approx = pm.fit(n=10000, method='advi', obj_optimizer=pm.adam(learning_rate=0.1), obj_n_mc=2)
Additional information
Again, there is no memory leak when obj_n_mc=1. With obj_n_mc=2 and with obj_n_mc=25, memory leaks to approximately the same extent.
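For anyone who wants to put numbers on the comparison between obj_n_mc settings, here is a minimal sketch of one way to measure retained memory across repeated calls. It uses only the standard library's tracemalloc. The helper name traced_growth and the two stand-in functions are hypothetical and not part of PyMC3; in practice the step would wrap a short pm.fit call with the obj_n_mc value under test. Note that tracemalloc only sees allocations made through Python's allocators, so memory held by compiled Theano code may not show up; watching the process RSS (for example with psutil) may be needed in that case.

```python
import gc
import tracemalloc


def traced_growth(step, rounds=5):
    """Call step() `rounds` times and return the traced memory (bytes) after each call."""
    tracemalloc.start()
    try:
        readings = []
        for _ in range(rounds):
            step()
            gc.collect()  # drop collectable garbage so only retained memory is counted
            current, _peak = tracemalloc.get_traced_memory()
            readings.append(current)
        return readings
    finally:
        tracemalloc.stop()


# Hypothetical stand-ins for a fitting call; replace with a function
# that runs a short pm.fit(...) for the obj_n_mc value being tested.
_retained = []


def leaky_step():
    # Keeps a reference to ~1 MB per call, so memory grows every round.
    _retained.append(bytearray(1_000_000))


def clean_step():
    # Allocates ~1 MB but drops it immediately, so nothing is retained.
    bytearray(1_000_000)


leaky = traced_growth(leaky_step)
clean = traced_growth(clean_step)
print("leaky growth (bytes):", leaky[-1] - leaky[0])
print("clean growth (bytes):", clean[-1] - clean[0])
```

A steadily increasing series of readings suggests that something is being retained between calls, while a flat series suggests it is not. Comparing the series for obj_n_mc=1 and obj_n_mc=2 would give a rough per-iteration leak size.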
Versions and main components
- PyMC3 Version: 3.6
- Theano Version: 1.0.3
- Python Version: 3.7.3
- Operating system: macOS
- How did you install PyMC3: conda