
MNPE class similar to MNLE #1362

Open · wants to merge 14 commits into main
Conversation

@dgedon (Collaborator) commented Jan 10, 2025

Implementation of mixed NPE where we have some continuous parameters theta followed by one (or, with PR #1269, multiple) discrete parameters. The observation space is fully continuous.

Deprecated mnle.py in net_builders and unified MNLE/MNPE as mixed_nets.py.
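For context, a minimal usage sketch. It assumes MNPE is exposed under sbi.inference analogously to MNLE and that MultipleIndependent accepts a Bernoulli component for the discrete parameter; exact names and signatures may differ from this PR:

```python
import torch
from torch.distributions import Bernoulli

from sbi.inference import MNPE  # assumed import path for the class added by this PR
from sbi.utils import BoxUniform, MultipleIndependent

# Prior: one continuous parameter followed by one discrete (binary) parameter.
prior = MultipleIndependent(
    [BoxUniform(torch.zeros(1), torch.ones(1)), Bernoulli(probs=0.5 * torch.ones(1))],
    validate_args=False,
)

theta = prior.sample((1000,))
# Fully continuous observations from a toy simulator mixing both parameters.
x = theta[:, :1] + theta[:, 1:] + 0.1 * torch.randn(1000, 1)

trainer = MNPE(prior=prior)
trainer.append_simulations(theta, x).train()
posterior = trainer.build_posterior()
samples = posterior.sample((100,), x=x[:1])
```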

@dgedon (Collaborator, Author) commented Jan 17, 2025

Update:

  • MNPE and tests are implemented
  • for the test with a Bernoulli prior, I had to change mcmc_transforms to handle discrete distributions; by default, we just compute mean/std for discrete distributions
  • currently, MNPE with embedding nets does not work yet; it raises an in-place operation error during the backward pass that I couldn't solve yet

@janfb (Contributor) commented Feb 25, 2025

@dgedon #1269 is now merged 🙌

@dgedon (Collaborator, Author) commented Mar 18, 2025

Updates:

  • bug fix so everything works now; essentially, normalization needs to be handled with care when switching from MNLE to MNPE
  • removed unnecessary GPU handling (hackathon task); this means MultipleIndependent does not support a device argument yet

@janfb (Contributor) left a comment:

Looks good overall, except one central question about the call signature of MNPE.

@@ -11,10 +11,12 @@


 class MixedDensityEstimator(ConditionalDensityEstimator):
-    """Class performing Mixed Neural Likelihood Estimation.
+    """Class performing Mixed Neural Density Estimation.
janfb (Contributor): 👍

Comment on lines +134 to +138
assert isinstance(
    density_estimator, MixedDensityEstimator
), f"""net must be of type MixedDensityEstimator but is {
    type(density_estimator)
}."""
janfb (Contributor):
we could also change the type above to be MixedDensityEstimator to have a static check.

dgedon (Collaborator, Author):
Good point. I'll add this in addition to the assertion.
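A minimal sketch of what the statically checked version could look like; the class stand-in, the enclosing method, and the import path are assumptions for illustration, not the actual code in this PR:

```python
from typing import Optional

from sbi.neural_nets.estimators import MixedDensityEstimator  # assumed import path


class MNPE:  # stripped-down stand-in for the class in this PR
    def __init__(self, density_estimator: Optional[MixedDensityEstimator] = None) -> None:
        # The annotation lets static type checkers catch wrong estimator types,
        # while the runtime assert still guards calls from untyped code.
        if density_estimator is not None:
            assert isinstance(density_estimator, MixedDensityEstimator), (
                f"net must be of type MixedDensityEstimator but is "
                f"{type(density_estimator)}."
            )
        self._neural_net = density_estimator
```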

Comment on lines 16 to 19
This estimator combines a Categorical net and a neural density estimator to model
data with mixed types (discrete and continuous), e.g., as they occur in
decision-making models. It can be used for both likelihood and posterior estimation
of mixed data.
janfb (Contributor):
Suggested change:
-    This estimator combines a Categorical net and a neural density estimator to model
-    data with mixed types (discrete and continuous), e.g., as they occur in
-    decision-making models. It can be used for both likelihood and posterior estimation
-    of mixed data.
+    This estimator combines a categorical mass estimator and a density estimator to model
+    variables with mixed types (discrete and continuous). It can be used for both likelihood
+    estimation (e.g., for discrete decisions and continuous reaction times in decision-making
+    models) or posterior estimation (e.g., for models that have both discrete and continuous
+    parameters).

"""The forward method is not implemented for MNLE, use '.sample(...)' to
generate samples though a forward pass."""
"""The forward method is not implemented for mixed neural density
estimation,use '.sample(...)' to generate samples though a forward
janfb (Contributor):
Suggested change:
-        estimation,use '.sample(...)' to generate samples though a forward
+        estimation, use '.sample(...)' to generate samples though a forward

Comment on lines 228 to 257
def build_mnle(
    batch_x: Tensor,
    batch_y: Tensor,
    **kwargs,
) -> MixedDensityEstimator:
    """Returns a mixed neural likelihood estimator.

    This estimator models p(x|theta) where x contains both continuous and discrete data.

    Args:
        batch_x: Batch of xs (data), used to infer dimensionality.
        batch_y: Batch of ys (parameters), used to infer dimensionality.
        **kwargs: Additional arguments passed to _build_mixed_density_estimator.

    Returns:
        MixedDensityEstimator for MNLE.
    """
    return _build_mixed_density_estimator(
        batch_x=batch_x, batch_y=batch_y, mode="mnle", **kwargs
    )


def build_mnpe(
    batch_x: Tensor,
    batch_y: Tensor,
    **kwargs,
) -> MixedDensityEstimator:
    """Returns a mixed neural posterior estimator.

    This estimator models p(theta|x) where x contains both continuous and discrete data.
janfb (Contributor):
I am confused by these two builder functions. Maybe I am missing something, but couldn't we just call _build_mixed_density_estimator with batch_x and batch_y swapped for MNPE and MNLE?

To me it seems this swapping is not happening, i.e., we need to make sure that in MNPE we only embed x and not theta, and vice versa in MNLE. Let's discuss tomorrow.

dgedon (Collaborator, Author):
True, we could remove both functions and just use _build_mixed_density_estimator.

The swapping of x/theta does happen, because the builder is called once via likelihood_nn and once via posterior_nn.
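For illustration, a hedged sketch of where the swap happens; the wrapper names below are stand-ins for how the likelihood_nn/posterior_nn factories forward their arguments, not code from this PR, and _build_mixed_density_estimator is assumed to come from the surrounding module:

```python
# How the two factories could map (theta, x) onto the generic
# (batch_x, batch_y) arguments of _build_mixed_density_estimator.

def likelihood_builder(batch_theta, batch_x, **kwargs):
    # MNLE models p(x | theta): x is the modeled variable, theta the condition.
    return _build_mixed_density_estimator(batch_x=batch_x, batch_y=batch_theta, **kwargs)


def posterior_builder(batch_theta, batch_x, **kwargs):
    # MNPE models p(theta | x): theta is the modeled variable, x the condition.
    return _build_mixed_density_estimator(batch_x=batch_theta, batch_y=batch_x, **kwargs)
```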

@dgedon (Collaborator, Author) commented Mar 19, 2025

Update:

  • simplify _build_mixed_density_estimator by removing the mode='mnpe'/'mnle' argument
  • add log_transform_x as a kwarg with a default value to build_mnle and build_mnpe

@janfb (Contributor) left a comment:

Some more comments on the tests and the refactoring.

I am suggesting a toy example with a ground-truth posterior for the MNPE scenario to test accuracy.

Comment on lines 729 to 741
try:
    prior_mean = prior.mean.to(device)
    prior_std = prior.stddev.to(device)
except (NotImplementedError, AttributeError):
    warnings.warn(
        "The passed discrete prior has no mean or stddev attribute, "
        "estimating them from samples to build affine standardizing "
        "transform.",
        stacklevel=2,
    )
    theta = prior.sample(torch.Size((num_prior_samples_for_zscoring,)))
    prior_mean = theta.mean(dim=0).to(device)
    prior_std = theta.std(dim=0).to(device)
janfb (Contributor):
move this into a small function to avoid code duplication?
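A possible shape for such a helper, as a sketch only; the function name and its exact location in the utils module are assumptions:

```python
import warnings

import torch


def _get_prior_mean_std(prior, device, num_samples: int = 1000):
    """Return the prior's mean/std, falling back to sample-based estimates for
    priors (e.g., discrete ones) that do not expose .mean / .stddev."""
    try:
        return prior.mean.to(device), prior.stddev.to(device)
    except (NotImplementedError, AttributeError):
        warnings.warn(
            "The passed prior has no mean or stddev attribute, estimating them "
            "from samples to build the affine standardizing transform.",
            stacklevel=2,
        )
        samples = prior.sample(torch.Size((num_samples,)))
        return samples.mean(dim=0).to(device), samples.std(dim=0).to(device)
```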

x = mixed_param_simulator(theta)

# Build estimator manually
theta_embedding = FCEmbedding(1, 1) # simple embedding net, 1 continuous parameter
janfb (Contributor):
this should be an x_embedding to avoid confusion

    flow_model=flow_model,
    z_score_theta=z_score_theta,
    embedding_net=theta_embedding if use_embed_net else torch.nn.Identity(),
    log_transform_x=False,
janfb (Contributor): 👍

    log_transform_x=False,
)
trainer = MNPE(density_estimator=density_estimator)
trainer.append_simulations(theta, x).train(max_num_epochs=5)
janfb (Contributor):
max_num_epochs=1 to speed up tests.

janfb (Contributor):
Please remove all diffs here. We probably want to have some kind of tutorial or how-to guide for MNPE, but let's wait for the new documentation setup.

Comment on lines +20 to +33
def mixed_param_simulator(theta: Tensor) -> Tensor:
    """Simulator for continuous data with mixed parameters.

    Args:
        theta: Parameters with mixed types - continuous and discrete.
    Returns:
        x: Continuous observation.
    """
    device = theta.device

    # Extract parameters
    a, b = theta[:, 0], theta[:, 1]
    noise = 0.05 * torch.randn(a.shape, device=device).reshape(-1, 1)
    return (a + 2 * b).reshape(-1, 1) + noise
janfb (Contributor):
Can we come up with a simulator for which we have a ground-truth posterior, so we can test accuracy?

E.g., a mixture of Gaussians where the discrete parameter just selects the component:

z ~ Categorical
x ~ N(mu_z, 1)

with Gaussian priors on mu_z with different prior means, e.g., -1 and 1.

Then, for a fixed z, the posterior is just a Gaussian composed from the likelihood and the prior Gaussian, just like in our standard linear Gaussian example.
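A hedged sketch of such a toy problem; the parameter layout (continuous means first, discrete component index last), the prior means, and all variances are illustrative choices rather than code from this PR:

```python
import torch
from torch.distributions import Normal

# Gaussian priors on the two component means: mu_0 ~ N(-1, 1), mu_1 ~ N(1, 1).
prior_means = torch.tensor([-1.0, 1.0])
prior_std = 1.0
likelihood_std = 1.0


def simulator(theta: torch.Tensor) -> torch.Tensor:
    """theta = (mu_0, mu_1, z) with discrete z in {0, 1}; returns x ~ N(mu_z, 1)."""
    mu, z = theta[:, :2], theta[:, 2].long()
    mu_z = mu.gather(1, z.unsqueeze(1))
    return mu_z + likelihood_std * torch.randn_like(mu_z)


def true_posterior_given_z(x_o: torch.Tensor, z: int) -> Normal:
    """For fixed z, the posterior over mu_z is the standard conjugate Gaussian,
    exactly as in the linear Gaussian reference example."""
    post_var = 1.0 / (1.0 / prior_std**2 + 1.0 / likelihood_std**2)
    post_mean = post_var * (prior_means[z] / prior_std**2 + x_o / likelihood_std**2)
    return Normal(post_mean, post_var**0.5)
```

The fixed-z check already gives an exact reference; the mixture over z could additionally be recovered by weighting these per-component posteriors.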
