Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing tension statistics #333

Open
wants to merge 63 commits into
base: master
Choose a base branch
from
Open

Implementing tension statistics #333

wants to merge 63 commits into from

Conversation

DilyOng
Copy link
Collaborator

@DilyOng DilyOng commented Aug 24, 2023

Description

This is a work in progress pull request aiming to address #325 and as a learning exercise on how to do pull request.

Checklist:

  • I have performed a self-review of my own code
  • My code is PEP8 compliant (flake8 anesthetic tests)
  • My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
  • New and existing unit tests pass locally with my changes (python -m pytest)
  • I have added tests that prove my fix is effective or that my feature works
  • I have appropriately incremented the semantic version number in both README.rst and anesthetic/_version.py

@codecov
Copy link

codecov bot commented Aug 24, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (a4c8521) to head (ce82e13).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #333   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           36        37    +1     
  Lines         3069      3097   +28     
=========================================
+ Hits          3069      3097   +28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@williamjameshandley
Copy link
Collaborator

Hi @DilyOng, many thanks for taking charge of incorporating this. Let's get it plumbed into anesthetic first, and then get feedback from others on if anything is missing.

At the moment, this code is specialised to a specific naming scheme (which is what the union and intersection functions are doing), and for a wider grid.

I think we should re-organise this so that in the first instance it is more similar to @AdamOrmondroyd's suspiciousness package, but retaining the class/cacheing structure of tension_calculator.

Tasks:

  • Create a class TensionCalculator in a new file anesthetic/tension.py (note CamelCase rather than under_score naming)
  • This class should __init__ with A, B and AB which are assumed to be NestedSamples, and cache self.A = A.stats(nsamples) (same for B and AB, which computes the nested sampling statistics (see also the docs)
  • This should then implement logR logS d D_KL and p
  • You should then create a tests/test_tension.py in the same style as the other test files, which uses anesthetic.examples.perfect_ns.correlated_gaussian functions to create mock A, B and AB, alongside equations 14 to 25 from 1902.04029 to test that the tension statitistics code gets the correct answer.

@williamjameshandley
Copy link
Collaborator

williamjameshandley commented Aug 24, 2023

I think after that it would also be good to implement a function in addition to (or possibly in place of!) the class for producing a Samples object containing columns of logR, D_KL, d, S, p, as an analogue to the output of NestedSamples.stats.

tension_calculator.py Outdated Show resolved Hide resolved
@AdamOrmondroyd
Copy link
Collaborator

Please remember to remove (git remove) the .DSstore files you've added (they are to do with macOS file management, so not relevant to the repo)

lukashergt and others added 12 commits September 29, 2023 17:20
…ation and testing it with correlated gaussian likelihoods. Found a problem with the function anesthetic.examples.perfect_ns.correlated_gaussian. The generated likelihood gaussian in the parameters is not normalised and the evidence is not unity. Need to take into account the LogLmax.
…ed_gaussian. Within the correlated_gaussian function, changed logLike function. Changed the function's description to match the fact that evidence is not unity.
…nction tension_stats() for calculating tension statistics. Rewrote the test_tension_stats.py in tests to match the format of other files. It tests mock datasets with guassian likelihood. Both compatiable and incompatiable datasets have passed the test.
…dd a file for datasets pairwise_comparison, but not completed
@williamjameshandley
Copy link
Collaborator

It would be good to get this finalised and merged now that #348 is complete -- any thoughts @DilyOng

DilyOng and others added 3 commits March 4, 2024 18:42
…the theoretical logR, logS and logI values sit within 3 std of the numerical solution's distribution from anesthetic, instead of testing between minimum and maximum values of the distribution.
anesthetic/tension.py Outdated Show resolved Hide resolved
anesthetic/tension.py Outdated Show resolved Hide resolved
anesthetic/tension_pvalue.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@williamjameshandley williamjameshandley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DilyOng,

I'm happy for this to be merged now. The tests are 'failing' due to the numpy< 2.0 flag, which won't be resolved until we merge #388, which needs a fastkde upgrade/deprecation

Please press 'squash and merge'. Congrats on your first PR.

anesthetic/tension.py Outdated Show resolved Hide resolved
@williamjameshandley
Copy link
Collaborator

@AdamOrmondroyd needs to approve the changes in order for this to be merged

Copy link
Collaborator

@AdamOrmondroyd AdamOrmondroyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments inline

covAB = inv(inv(covA) + inv(covB))
meanAB = covAB@(solve(covA, meanA)+solve(covB, meanB))
dmeanAB = np.array(meanA)-np.array(meanB)
logLmaxAB = -1/2 * dmeanAB@solve(covA+covB, dmeanAB) + logLmaxA + logLmaxB
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing spaces around @

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


logS_std = samples_stats.logS.std()
logS_mean = samples_stats.logS.mean()
logS_exact = d/2 - 1/2*dmeanAB@solve(covA+covB, dmeanAB)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

bounds = [[-1, 1], [0, 3], [0, 1]]

meanA = [0.1, 0.3, 0.5]
covA = np.array([[.01, 0.009, 0],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing 0 in 0.01 (there are a few of these, I'm technically on holiday and using my phone so you can find the rest)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +92 to +93
samples['logI'] = statsA['D_KL'] + statsB['D_KL'] - statsAB['D_KL']
samples.set_label('logI', r'$\ln\mathcal{I}$')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@williamjameshandley: The notation with logI matches the one in eq. (9) of Quantifying tensions in cosmological parameters. However, the Shannon information I_S = log(P/pi) already carries a logarithm in its notation, so I find this logI notation confusing. Wouldn't just I be more appropiate?

The notation logR = logS - I also matches better the Occam equation logZ = logL_P - D_KL...

Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants