
[DRAFT] Importance sampling #47

Merged
yallup merged 12 commits into htjb:master from importance on Nov 24, 2023

Conversation

@yallup (Collaborator) commented Oct 26, 2023

This PR implements importance sampling from a margarine flow. This is something like neural importance sampling, only with nested sampling doing the information acquisition; neural importance nested sampling, I guess?

Workflow as follows:

  1. run_pypolychord.py - A copy of the standard pypolychord script: a 4D Gaussian, restricted to a hypercube prior for now for simplicity.
  2. train_maf.py - Trains a margarine MAF on the polychord run.
  3. importance.py - Uses the trained MAF to importance sample the original likelihood again. I've added an integrate function to the margarine.marginal_stats.calculate class to do this (see the sketch below the results):
IS integral: 0.064 +/- 0.000
IS efficiency: 0.901
NS integral: 0.063 +/- 0.011
NS efficiency: 0.004

This may have broader use (provided I've done this right) as an afterburner to improve the nested sampling error estimate for moderate-dimension problems. To be discussed, and tested to see whether this actually works as well as I claim!
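For reference, a minimal numpy sketch of the estimator the integrate function is meant to implement; flow_sample, flow_log_prob and log_likelihood are placeholder callables standing in for the trained MAF and the likelihood, not margarine's actual API:

```python
import numpy as np

def importance_integral(flow_sample, flow_log_prob, log_likelihood, n=10_000):
    """Monte Carlo estimate of Z = E_q[L(x)/q(x)], drawing x from the flow q."""
    x = flow_sample(n)                                # x ~ q, the trained MAF
    log_w = log_likelihood(x) - flow_log_prob(x)      # log importance weights
    w = np.exp(log_w - log_w.max())                   # rescale for stability
    z = w.mean() * np.exp(log_w.max())                # IS integral estimate
    z_err = w.std(ddof=1) / np.sqrt(n) * np.exp(log_w.max())  # Monte Carlo error
    return z, z_err
```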

@codecov-commenter commented Oct 26, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison: base (5fcb811) 81.07% vs. head (60fe6b0) 82.35%.


Additional details and impacted files
@@            Coverage Diff             @@
##           master      #47      +/-   ##
==========================================
+ Coverage   81.07%   82.35%   +1.27%     
==========================================
  Files           5        5              
  Lines         539      578      +39     
==========================================
+ Hits          437      476      +39     
  Misses        102      102              
Files                          Coverage Δ
margarine/marginal_stats.py    86.60% <96.15%> (+7.15%) ⬆️


@htjb (Owner) commented Oct 26, 2023

Hi @yallup, this looks very cool! We discussed a bit offline, but to record some of my thoughts on this: my understanding is that you want to perform the integration using importance sampling between the trained flow $\tilde{L}$ and the actual likelihood $L$, which (according to the code) amounts to

$Z = \int \frac{L(x)}{\tilde{L}(x)} \tilde{L}(x)\, dx = \mathbb{E}\left[\frac{L(x)}{\tilde{L}(x)}\right] \approx \frac{1}{N} \sum_{i=1}^{N} \frac{L(x_i)}{\tilde{L}(x_i)}$

where $x \sim \tilde{L}(x)$. To me this looks a bit strange because there is no prior term, and we discussed offline that we might need to account for the prior. I think the prior would come in here:

$Z = \int \frac{L(x)}{\tilde{L}(x)} \tilde{L}(x) \pi(x)\, dx \approx \frac{1}{N} \sum_{i=1}^{N} \frac{L(x_i)}{\tilde{L}(x_i)} \pi(x_i)$

but I'm not 100% sure.
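If the prior does belong in the weights, it would be a one-line change to the sketch further up the thread; again the callables here (including log_prior) are placeholders, not margarine's API:

```python
import numpy as np

def importance_integral_with_prior(flow_sample, flow_log_prob,
                                   log_likelihood, log_prior, n=10_000):
    """As before, but estimating Z = E_q[L(x) pi(x) / q(x)] with x ~ q."""
    x = flow_sample(n)
    log_w = log_likelihood(x) + log_prior(x) - flow_log_prob(x)  # prior term added
    w = np.exp(log_w - log_w.max())
    return w.mean() * np.exp(log_w.max())
```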

Am I correct in interpreting the efficiency as a measure of how accurate the flow is? We define the weights as $w = \frac{L(x)}{\tilde{L}(x)}$ and the efficiency as the ratio of $n_{eff}$ to the total number of samples, so if the flow were perfect then $n_{eff} = N$ and $\mathrm{eff} = N/N = 100\%$.

It's not in any way a measure of how efficient the nested sampling run is? That is what we get when we calculate $n_{eff}$ with the nested sampling weights and divide by the number of samples drawn. This last point is relevant to some other discussions we have been having offline, where the sampling efficiency of a nested sampling run measures how quickly it reaches the posterior bulk, which is useful for working out whether you have made a good prior choice.
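For concreteness, one common definition of $n_{eff}$ for weighted samples is the Kish effective sample size; I'm assuming that's what is meant here. The same formula applies to both sets of weights, just with different interpretations:

```python
import numpy as np

def kish_efficiency(weights):
    """n_eff / N for unnormalised weights: (sum w)^2 / (N * sum w^2)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w.size * (w ** 2).sum())

# With IS weights w_i = L(x_i)/q(x_i) this measures how accurate the flow is;
# with nested sampling posterior weights it instead measures how quickly the
# run reached the posterior bulk.
```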

@htjb (Owner) commented Oct 30, 2023

Could we move run_polychord.py, train.py and importance.py into the tutorial notebook in the notebook/ folder please?

Ohh and write a test to check that the evidence recovered from importance.py is equivalent to the value from run_polychord.py? Thanks!

@yallup (Collaborator, Author) commented Nov 1, 2023

> Could we move run_polychord.py, train.py and importance.py into the tutorial notebook in the notebook/ folder please?
>
> Ohh and write a test to check that the evidence recovered from importance.py is equivalent to the value from run_polychord.py? Thanks!

Test added and files removed. I wasn't sure how you wanted to approach documenting this in the tutorial, so I've left that for now (probably to be added after we decide whether this works).
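For anyone following along, a self-contained sketch of the kind of consistency test requested above, using a toy Gaussian with known evidence in place of the actual polychord run and trained flow (the real test in the PR may look different):

```python
import numpy as np
from scipy import stats

def test_importance_integral_recovers_known_evidence():
    """Toy version of the requested check: a likelihood with known integral
    z_true and a deliberately mismatched proposal standing in for the flow."""
    rng = np.random.default_rng(0)
    d, n, z_true = 4, 100_000, 0.064
    like = stats.multivariate_normal(np.full(d, 0.5), 0.01 * np.eye(d))
    flow = stats.multivariate_normal(np.full(d, 0.5), 0.02 * np.eye(d))
    x = flow.rvs(n, random_state=rng)
    # log importance weights: log L(x) - log q(x), with L integrating to z_true
    log_w = like.logpdf(x) + np.log(z_true) - flow.logpdf(x)
    z_hat = np.exp(log_w).mean()
    assert abs(z_hat - z_true) / z_true < 0.05
```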

@htjb (Owner) commented Nov 8, 2023

@yallup sure, no worries, we can add a tutorial later down the line. Can we bump the version number to 1.2.0 here? It needs doing in the README and setup.py.

I just changed the KL divergence function so that it returns a dictionary rather than a pandas table. It's a bit nicer to use and more consistent with the new importance sampling function you have added here.

@htjb (Owner) commented Nov 8, 2023

Ohh, I think the master branch might need merging in here too...

@yallup yallup removed the request for review from williamjameshandley November 10, 2023 11:26
@htjb (Owner) left a review comment

Looks good to me. Go ahead and squash and merge when you get a chance! Thanks @yallup! 🚀

@yallup yallup merged commit dff0971 into htjb:master Nov 24, 2023
4 checks passed
@yallup yallup deleted the importance branch November 24, 2023 09:33
@htjb htjb mentioned this pull request Nov 29, 2023