WeightedGroupBy #272

Merged: 76 commits, Apr 7, 2023
Commits
ab63486
first pass at WeightedGroupBy
AdamOrmondroyd Mar 10, 2023
e4e834e
correct cov
AdamOrmondroyd Mar 10, 2023
94aecb9
remove duplicate cov
AdamOrmondroyd Mar 10, 2023
ada20d7
give up on cov for now
AdamOrmondroyd Mar 10, 2023
3abca58
use Lukas' test
AdamOrmondroyd Mar 10, 2023
626bce3
remove unecessary import
AdamOrmondroyd Mar 10, 2023
ca31333
remove currently unused lines from tests
AdamOrmondroyd Mar 10, 2023
9578b4b
version bump
AdamOrmondroyd Mar 10, 2023
8531a56
sort out docstrings
AdamOrmondroyd Mar 10, 2023
9598eac
fix indentation
AdamOrmondroyd Mar 10, 2023
192bffe
tests using cobaya chains
AdamOrmondroyd Mar 10, 2023
de9b4f7
test formatting
AdamOrmondroyd Mar 10, 2023
55e9f4e
reinstate median test
AdamOrmondroyd Mar 14, 2023
dfc4c3b
change numeric_only to None in median
AdamOrmondroyd Mar 14, 2023
244cd1f
stick underscores in front to see if this fixes the documentation
AdamOrmondroyd Mar 14, 2023
e3badbb
Revert "stick underscores in front to see if this fixes the documenta…
AdamOrmondroyd Mar 14, 2023
ca00255
add missing no cover to WeightedSeries.groupby()
AdamOrmondroyd Mar 14, 2023
d97b760
remove `:show-inheritance:` for `weighted_pandas` autodocs, cross ref…
lukashergt Mar 15, 2023
bfa2647
fix autodocs for `weighted_pandas`
lukashergt Mar 15, 2023
6703ca7
drop `WeightedGroupBy.kurtosis` also from tests
lukashergt Mar 15, 2023
40cd004
make `WeightedDataFramGroupBy` and `WeightedSeriesGroupBy` private, s…
lukashergt Mar 15, 2023
0f2104e
make `WeightedGroupBy.grouper` private
lukashergt Mar 15, 2023
87e77c4
Merge branch 'master' into groupby
AdamOrmondroyd Mar 16, 2023
9636d5f
version bump
AdamOrmondroyd Mar 16, 2023
6e86823
Merge branch 'master' into groupby
AdamOrmondroyd Mar 20, 2023
0a83fe0
version bump
AdamOrmondroyd Mar 20, 2023
8c907f1
Removed hard-coded numeric_only arguments
williamjameshandley Mar 22, 2023
eac682a
version bump
williamjameshandley Mar 22, 2023
2721c19
Merge branch 'master' into groupby
williamjameshandley Mar 22, 2023
c495362
Updated weighted samples
williamjameshandley Mar 22, 2023
7fa1cec
Completed coverage
williamjameshandley Mar 22, 2023
99a52e3
add missing space before inline comment
AdamOrmondroyd Mar 22, 2023
2e6d54e
joint call of column name and label
AdamOrmondroyd Mar 22, 2023
e45aaa6
formatting
AdamOrmondroyd Mar 22, 2023
5725ddf
additional chains.get_group(chains) tests
AdamOrmondroyd Mar 22, 2023
788fa84
added kurtosis, kurt, skew, mad, sem
williamjameshandley Mar 23, 2023
f22a24c
fix docs for weighted groupby sample methods
lukashergt Mar 24, 2023
d2215a5
complete coverage by adding test for `WeightedSeriesGroupBy.sample`
lukashergt Mar 24, 2023
866b3b3
fix groupby test for `WeightedSeriesGroupBy.sample`
lukashergt Mar 24, 2023
57a1c1f
add quantile
AdamOrmondroyd Mar 27, 2023
aff1455
add tests for corr, line 1441 causing invalid value warning
AdamOrmondroyd Mar 27, 2023
83e2c4d
add test for cov
AdamOrmondroyd Mar 27, 2023
0654d98
move quantile to end
AdamOrmondroyd Mar 27, 2023
43f0882
add test for corrwith
AdamOrmondroyd Mar 27, 2023
f1c966d
change `i` to `mask` to make it clearer that this is not a single ind…
lukashergt Mar 27, 2023
142740c
add tests that check whether `groupby` results from `mean`, `std`, `c…
lukashergt Mar 27, 2023
7b0a8e1
add groupby tests for `mad`, `corr`, `cov` and `corrwith` that check …
lukashergt Mar 27, 2023
911f54e
add tests for groupby that explicitly check that the methods return t…
lukashergt Mar 27, 2023
23f2d3d
Added some cleaner tests for get_group
williamjameshandley Mar 28, 2023
c5391a5
Merge branch 'groupby' of github.com:Ormorod/anesthetic into groupby
williamjameshandley Mar 28, 2023
d6423aa
partial completion of covariance
williamjameshandley Mar 29, 2023
706d759
Now using rather than
williamjameshandley Mar 29, 2023
bf07118
Added a wrapper for cov, corr, corrwith
williamjameshandley Mar 29, 2023
17d4332
corr and cov now working
williamjameshandley Mar 29, 2023
a71151e
reduced code repetition
williamjameshandley Mar 29, 2023
2935434
corrwith
williamjameshandley Mar 29, 2023
93b06a0
Corrections to two extra functions
williamjameshandley Mar 29, 2023
0655a9c
skipna no longer available for cov
williamjameshandley Mar 29, 2023
33dd6e0
Completed coverage with new nan
williamjameshandley Mar 30, 2023
5ec1fec
Increase coverage
williamjameshandley Mar 30, 2023
5113b61
add test for groupby().hist()
AdamOrmondroyd Mar 30, 2023
918986c
add test for groupby().plot.hist(), not happy with the janky slicing …
AdamOrmondroyd Mar 30, 2023
b13b0a2
add test for groupby().plot.kde()
AdamOrmondroyd Mar 31, 2023
9709b38
add tests for hist_1d and kde_1d
AdamOrmondroyd Mar 31, 2023
338fc8a
test for fastkde_1d
AdamOrmondroyd Mar 31, 2023
0e2676a
test for hist_2d
AdamOrmondroyd Mar 31, 2023
9589921
plt.close('all')
AdamOrmondroyd Mar 31, 2023
4e8e50f
test for kde_2d
AdamOrmondroyd Mar 31, 2023
a631d60
test for fastkde_2d
AdamOrmondroyd Mar 31, 2023
132d80f
Reinstated init function to get documentation to work
williamjameshandley Apr 4, 2023
369c49c
complete test coverage for explicit weight checks
lukashergt Apr 4, 2023
122bf2b
Readme correction following #217
williamjameshandley Apr 5, 2023
6d686f6
Merge branch 'groupby' of github.com:Ormorod/anesthetic into groupby
williamjameshandley Apr 5, 2023
5a5f106
fix `GelmanRubin` method now that `groupby` is fixed
lukashergt Apr 7, 2023
9d10ce5
add test for `LinAlgError` when covariance matrix is not positive def…
lukashergt Apr 7, 2023
85fa6ae
make linear dependence more blatant in check for `LinAlgError`
lukashergt Apr 7, 2023

1 change: 0 additions & 1 deletion .github/workflows/CI.yaml
@@ -44,7 +44,6 @@ jobs:
       - name: Upgrade pip and install doc requirements
         run: |
           python -m pip install --upgrade pip
-          python -m pip install pip-tools
           python -m pip install -e ".[extras,docs]"
       - name: build documentation
         run: |
6 changes: 3 additions & 3 deletions README.rst
@@ -2,7 +2,7 @@
 anesthetic: nested sampling post-processing
 ===========================================
 :Authors: Will Handley and Lukas Hergt
-:Version: 2.0.0-beta.25
+:Version: 2.0.0-beta.26
 :Homepage: https://github.com/williamjameshandley/anesthetic
 :Documentation: http://anesthetic.readthedocs.io/

@@ -191,8 +191,8 @@ Why create another one? In general, any dedicated user of software will find tha

 .. code:: python

-    from anesthetic import MCMCSamples
-    samples = MCMCSamples(root=file_root)                      # Load the samples
+    from anesthetic import read_chains
+    samples = read_chains(file_root)                           # Load the samples
     samples['omegab'] = samples.omegabh2/(samples.H0/100)**2   # Define omegab
     samples.tex['omegab'] = '$\Omega_b$'                       # Label omegab
     samples.plot_1d('omegab')                                  # Simple 1D plot
2 changes: 1 addition & 1 deletion anesthetic/_version.py
@@ -1 +1 @@
-__version__ = '2.0.0b25'
+__version__ = '2.0.0b26'
37 changes: 22 additions & 15 deletions anesthetic/samples.py
@@ -515,7 +515,7 @@ def remove_burn_in(self, burn_in, reset_index=False, inplace=False):
             Indicates whether to modify the existing array or return a copy.

         """
-        chains = self.groupby(('chain', '$n_\\mathrm{chain}$'),
+        chains = self.groupby(('chain', '$n_\\mathrm{chain}$'), sort=False,
                               group_keys=False)
         nchains = chains.ngroups
         if isinstance(burn_in, (int, float)):
@@ -574,25 +574,32 @@ def Gelman_Rubin(self, params=None):
                       and 'logL' not in key
                       and 'chain' not in key]
         chains = self[params+['chain']].groupby(
-            ('chain', '$n_\\mathrm{chain}$')
+            ('chain', '$n_\\mathrm{chain}$'), sort=False,
         )
+        nchains = chains.ngroups

         # Within chain variance ``W``
         # (average variance within each chain):
-        W = chains.cov().groupby(level=['params', 'labels']).mean().to_numpy()
-        # TODO: the above line should be a weighted mean
-        # --> need to fix groupby for WeightedDataFrames!
-
+        W = chains.cov().groupby(level=('params', 'labels'), sort=False).mean()
         # Between-chain variance ``B``
-        # (variance of the chain means compared to the full mean):
-        means_diff = (chains.mean() - self[params].mean()).to_numpy()
-        B = (means_diff.T @ means_diff) / (chains.ngroups - 1)
-        # B = chains.mean().cov().to_numpy()
-        # TODO: fix once groupby is fixed
-
-        L = np.linalg.cholesky(W)
-        invL = np.linalg.inv(L)
-        D = np.linalg.eigvalsh(invL @ B @ invL.T)
+        # (variance of the chain means):
+        B = np.atleast_2d(np.cov(chains.mean().T, ddof=1))
+        # We don't weight `B` with the effective number of samples (sum of the
+        # weights), here, because we want to notice outliers from shorter
+        # chains.
+        # In order to be conservative, we generally want to underestimate `W`
+        # and overestimate `B`, since `W` goes in the denominator and `B` in
+        # the numerator of the Gelman--Rubin statistic `Rminus1`.
+
+        try:
+            invL = np.linalg.inv(np.linalg.cholesky(W))
+        except np.linalg.LinAlgError as e:
+            raise np.linalg.LinAlgError(
+                "Make sure you do not have linearly dependent parameters, "
+                "e.g. having both `As` and `A=1e9*As` causes trouble.") from e
+        D = np.linalg.eigvalsh(invL @ ((nchains+1)/nchains * B) @ invL.T)
+        # The factor of `(nchains+1)/nchains` accounts for the additional
+        # uncertainty from using a finite number of chains.
         Rminus1 = np.max(np.abs(D))
         return Rminus1

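The `Gelman_Rubin` changes above can be illustrated with a stand-alone NumPy sketch. This is not the anesthetic implementation: the function name `gelman_rubin` and its list-of-arrays input are hypothetical, and the sketch assumes unweighted chains, whereas the PR builds `W` from weighted per-chain covariances obtained via `groupby`.

```python
import numpy as np


def gelman_rubin(chains):
    """Return R-1 for a list of unweighted (nsamples, nparams) arrays.

    Hypothetical helper for illustration only; sample weights are ignored.
    """
    chains = [np.asarray(c, dtype=float) for c in chains]
    nchains = len(chains)
    # Within-chain covariance W: average of the per-chain covariances.
    W = np.mean([np.atleast_2d(np.cov(c.T, ddof=1)) for c in chains], axis=0)
    # Between-chain covariance B: covariance of the chain means.
    means = np.array([c.mean(axis=0) for c in chains])
    B = np.atleast_2d(np.cov(means.T, ddof=1))
    try:
        invL = np.linalg.inv(np.linalg.cholesky(W))
    except np.linalg.LinAlgError as e:
        raise np.linalg.LinAlgError(
            "W is not positive definite; check for linearly dependent "
            "parameters.") from e
    # Largest eigenvalue of B relative to W, inflated by (nchains+1)/nchains
    # to account for the finite number of chains.
    D = np.linalg.eigvalsh(invL @ ((nchains + 1) / nchains * B) @ invL.T)
    return np.max(np.abs(D))
```

Chains drawn from the same distribution should give a value close to zero, while a chain whose mean is offset from the others inflates `B` and therefore the statistic.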
11 changes: 6 additions & 5 deletions anesthetic/utils.py
@@ -530,8 +530,9 @@ class to adjust
     """
     for key, val in cls.__dict__.items():
         doc = inspect.getdoc(val)
-        newdoc = re.sub(pattern, repl, doc, *args, **kwargs)
-        try:
-            cls.__dict__[key].__doc__ = newdoc
-        except AttributeError:
-            pass
+        if doc is not None:
+            newdoc = re.sub(pattern, repl, doc, *args, **kwargs)
+            try:
+                cls.__dict__[key].__doc__ = newdoc
+            except AttributeError:
+                pass
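The `doc is not None` guard in the `utils.py` hunk matters because `re.sub` raises a `TypeError` when handed `None`, which is what `inspect.getdoc` returns for a member without a docstring. A minimal sketch of the guarded pattern follows; the names `replace_in_docstrings` and `Example` are made up for illustration and are not the anesthetic API.

```python
import inspect
import re


def replace_in_docstrings(cls, pattern, repl):
    """Rewrite the docstrings of members defined directly on ``cls``."""
    for key, val in cls.__dict__.items():
        doc = inspect.getdoc(val)
        if doc is not None:  # skip members without a docstring
            newdoc = re.sub(pattern, repl, doc)
            try:
                cls.__dict__[key].__doc__ = newdoc
            except AttributeError:
                pass  # some members have a read-only __doc__


class Example:
    def documented(self):
        """Return a DataFrame."""

    def undocumented(self):
        pass


replace_in_docstrings(Example, "DataFrame", "WeightedDataFrame")
```

Without the guard, the loop would fail on `Example.undocumented` before finishing the remaining members.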