Reorganisation into wpandas and lpandas #280

williamjameshandley · 2023-04-09T10:21:07Z

Description

This arises from a conversation last summer with @lukashergt. If it is something that we still want to do, we should do it before the anesthetic 2.0.0 release.

The idea here is to separate out the weighted pandas functionality from the anesthetic functionality.

The benefit of this is that it makes it easier to see (particularly when it comes to pandas plotting overrides) what is modifying pandas by adding weights (wpandas) or labels (lpandas), and what then adds to this to provide new plotting tools (anesthetic).

This PR would also be an opportunity to do any other general reorganisation, for example renaming anesthetic.samples to anesthetic.core, and moving anesthetic.{plot,boundary,kde} into anesthetic.plotting.

The downside is that by doing this tidy-up we will have to get used to new locations/names, and will break scripts that relied on anesthetic.samples, which is allowed when updating a major version, but not ideal when we've been in beta for so long. This will likely also cause grief for the in-progress #270, but I'm happy to help @AdamOrmondroyd merge/redo that if we agree that this PR is in principle a good idea.

Checklist:

I have performed a self-review of my own code
My code is PEP8 compliant (flake8 anesthetic tests)
My code contains compliant docstrings (pydocstyle --convention=numpy anesthetic)
New and existing unit tests pass locally with my changes (python -m pytest)
I have added tests that prove my fix is effective or that my feature works
I have appropriately incremented the semantic version number in both README.rst and anesthetic/_version.py

codecov · 2023-04-09T10:29:02Z

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (08905f4) 100.00% compared to head (536a56f) 100.00%.

Additional details and impacted files

@@            Coverage Diff             @@
##            master      #280    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           30        25     -5     
  Lines         2643      1892   -751     
==========================================
- Hits          2643      1892   -751

Impacted Files	Coverage Δ
anesthetic/convert.py	`100.00% <ø> (ø)`
anesthetic/gui/plot.py	`100.00% <ø> (ø)`
anesthetic/read/chain.py	`100.00% <ø> (ø)`
anesthetic/__init__.py	`100.00% <100.00%> (ø)`
anesthetic/_version.py	`100.00% <100.00%> (ø)`
anesthetic/core.py	`100.00% <100.00%> (ø)`
anesthetic/examples/perfect_ns.py	`100.00% <100.00%> (ø)`
anesthetic/plot.py	`100.00% <100.00%> (ø)`
anesthetic/plotting/__init__.py	`100.00% <100.00%> (ø)`
anesthetic/plotting/_matplotlib/__init__.py	`100.00% <100.00%> (ø)`
... and 8 more

... and 4 files with indirect coverage changes


AdamOrmondroyd · 2023-04-09T16:44:00Z

I was already going to restart #270 after #272, as I will need to rethink the fixes for cov.

Hopefully this will also make rethinking labels more straightforward, so that e.g. Samples.plot.hist() will use the proper labels rather than the tuple-looking things we have at the moment, though this may now require both lpandas/plotting/ and wpandas/plotting/.

AdamOrmondroyd · 2023-04-10T14:15:37Z

Realised I'd missed the brackets in super() from #235, so have also corrected here.
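For context, a minimal sketch of this class of bug, using hypothetical class names (the thread does not show the offending line from #235). Without brackets, super is the builtin type itself rather than a proxy bound to the current instance, so the parent initialiser is never reached.

```python
class Base:
    def __init__(self, data):
        self.data = data


class Broken(Base):
    def __init__(self, data):
        # Missing brackets: this looks up __init__ on the builtin `super` type,
        # not on the parent class, and fails with a TypeError when called.
        super.__init__(data)


class Fixed(Base):
    def __init__(self, data):
        # With brackets, super() returns a proxy that dispatches to Base.
        super().__init__(data)


try:
    Broken([1, 2, 3])
except TypeError as err:
    print("Broken:", err)

print("Fixed:", Fixed([1, 2, 3]).data)
```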

williamjameshandley · 2023-04-11T19:09:27Z

@lukashergt what do you think about this suggestion:

This PR would also be an opportunity to do any other general reorganisation, for example renaming anesthetic.samples to anesthetic.core, and moving anesthetic.{plot,boundary,kde} into anesthetic.plotting.

Is there any other reorganisation you have considered in the past?

lukashergt · 2023-04-11T19:26:05Z

@lukashergt what do you think about this suggestion:

Looking through it at the moment...

On a general level, I like it, as it cleans up how things are built up on pandas and matplotlib...

The downside is that by doing this tidy-up we will have to get used to new locations/names, and will break scripts that relied on anesthetic.samples,

What type of scripts will break? The core functionalities using the NestedSamples methods should continue working as expected, right? After all, weighted_pandas was mostly hidden to the user under the hood of NestedSamples...

I think it will mostly break for anybody building on weighted_pandas directly, but for that type of user these changes should probably improve things in the long run...

williamjameshandley · 2023-04-11T19:53:41Z

What type of scripts will break?

I was referring more to the case where we rename e.g. anesthetic.samples to anesthetic.core: anybody who imports e.g. merge_nested_samples from anesthetic.samples would now find this failing. I guess this could be accomplished by aliasing and raising a deprecation warning for anybody importing from anesthetic.samples.

williamjameshandley · 2023-04-11T20:15:23Z

I guess this could be accomplished by aliasing and raising a deprecation warning for anybody importing from anesthetic.samples.

fbf4a7a gives a revertable example of this.
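The commit itself is not reproduced in this thread. As a rough sketch of the aliasing idea only, the following uses in-memory stand-in modules with hypothetical names (mypkg.core, mypkg.samples, and a placeholder merge_nested_samples) and a module-level __getattr__ (PEP 562) to forward old imports while warning. In a real package the forwarding function would simply be defined as __getattr__ at the top level of the old module file.

```python
import types
import warnings

# New home of the code: a stand-in for the renamed module (hypothetical name).
core = types.ModuleType("mypkg.core")


def merge_nested_samples(runs):
    """Placeholder body; only the import path matters for this sketch."""
    return list(runs)


core.merge_nested_samples = merge_nested_samples

# Old location: an otherwise empty alias module that forwards to `core`.
samples = types.ModuleType("mypkg.samples")


def _forward_with_warning(name):
    """Module-level __getattr__ (PEP 562): called only for missing attributes."""
    warnings.warn(
        "mypkg.samples is deprecated; import from mypkg.core instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return getattr(core, name)


samples.__getattr__ = _forward_with_warning

# Old-style access still resolves, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    func = samples.merge_nested_samples
print(func is core.merge_nested_samples, caught[0].category.__name__)
```

Because the alias module holds no names of its own, removing it later is a one-file change, which is one way such a shim could stay easy to revert.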

lukashergt

See comments inline.

lukashergt · 2023-04-11T20:14:02Z

wpandas/core.py

Would it be helpful to start mirroring pandas more by separating the classes in here and putting them in the matching module names?

Then WeightedSeries and WeightedDataFrame would go in a file frame.py, and WeightedGroupBy would go in a subfolder groupby in a file groupby.py, and WeightedSeriesGroupBy and WeightedDataFrameGroupBy would go in the same subfolder groupby in a file generic.py...

The advantage would be that it makes clearer how/where to compare this to in pandas.

The downside is that it makes the architecture more (possibly too?) complicated...

OK, e9fba58 begins this procedure for wpandas. Before I go and do the same for lpandas, anesthetic, and all the documentation-related changes, are we happy with this level of reorganisation?

I was skeptical, but it does keep the files nice and small; the old weighted_pandas.py was getting a bit large for my tastes.

However, if you really want to match the pandas layout, the tests are actually included within the pandas package under pandas/tests/, so I guess we should have anesthetic/tests/, wpandas/tests/ and lpandas/tests/?

Hmm, I'm on the fence. This does blow things up quite a bit. Find it hard to judge what will be easier to maintain. Do you think the matching file structure will make it easier to adapt to changes in pandas?

I'm not sure whether the same kind of matching is needed for lpandas. The weights are such a substantial addition that they really carry through everything, but I don't think the same goes for labels.

For anesthetic I think I'd prefer keeping things simple.

Sorry, I don't have a very clear line on this, and can be easily swayed either way. I think for wpandas it might help.

I think the lpandas structure should match (separate frame.py, series.py etc) but perhaps leave the tests where they are?

lukashergt · 2023-04-11T20:14:39Z

wpandas/__init__.py

Now that this becomes its proper subpackage, it would probably be good to give this a file docstring.

lukashergt · 2023-04-11T20:17:10Z

tests/test_weighted_pandas.py

I think the wpandas subpackage should be completely independent from anesthetic, so we should make sure its tests do not rely on anesthetic imports.

Should channel_capacity be a wpandas function?

Should we have test subfolders splitting tests for anesthetic and wpandas?

Ah, regarding the first point, channel_capacity is already moved to wpandas but imported in anesthetic.utils. We should use wpandas.utils in these tests instead.

Should we have test subfolders splitting tests for anesthetic and wpandas?

Yes. I will do this (after we've decided on the degree of reorganisation)

Should channel_capacity be a wpandas function?

We could just use the scipy entropy function: np.exp(entropy(weights))

@htjb is considering a PR which updates neff to have both the kish and entropy 'method' options. Perhaps it would be better to simply replace channel_capacity with a less cryptically named function.

Just found out that the Kish and entropy methods are actually related, which is pretty cool and helped me understand where the cryptic channel capacity was actually coming from... See #285 (comment).
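As a rough illustration of the two 'method' options mentioned above, here is a dependency-free sketch (the function names are hypothetical, not anesthetic's API). The Kish estimate is (sum w)^2 / sum(w^2), and the entropy estimate is exp(H) with H = -sum(p ln p) for normalised weights p. scipy.stats.entropy(weights) returns this same H with its default natural logarithm, so under that convention the exponent carries no extra minus sign.

```python
import math


def neff_kish(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total**2 / sum(w * w for w in weights)


def neff_entropy(weights):
    """Entropy-based effective sample size: exp of the Shannon entropy."""
    total = sum(weights)
    # Zero weights contribute nothing (p ln p -> 0 as p -> 0), so skip them.
    probs = [w / total for w in weights if w > 0]
    shannon = -sum(p * math.log(p) for p in probs)
    return math.exp(shannon)


uniform = [1.0, 1.0, 1.0, 1.0]
uneven = [0.5, 0.25, 0.25]
for name, w in [("uniform", uniform), ("uneven", uneven)]:
    print(name, neff_kish(w), neff_entropy(w))
```

For uniform weights both estimates equal the number of samples. For uneven weights both drop below it, with the Kish value never exceeding the entropy one, since they are the exponentials of the order-2 and order-1 Rényi entropies respectively.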

AdamOrmondroyd · 2023-04-20T21:49:37Z

I was having a go at playing with the docs to see if I could get wpandas to work, just to get my head around Sphinx, but I can't work out how to easily show the contents of e.g. wpandas/__init__.py, so that the API is clear.

lukashergt · 2023-04-21T00:09:29Z

I was having a go at playing with the docs to see if I could get wpandas to work, just to get my head around Sphinx, but I can't work out how to easily show the contents of e.g. wpandas/__init__.py, so that the API is clear.

There is an issue in that we now have multiple parallel packages in one, which makes the auto-generated sphinx documentation not as straightforward... Where previously we would need to run

sphinx-apidoc -fM -t docs/templates/ -o docs/source/ anesthetic/

this won't work anymore, since we now have anesthetic/ and wpandas/, two packages in parallel.
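One workaround, which a later commit message in this PR describes as running sphinx-apidoc on anesthetic, wpandas and lpandas separately, can be sketched as follows. The helper name is hypothetical and the flags are copied from the command above; the sketch only builds and prints the commands rather than executing them.

```python
import shlex

# Top-level packages living side by side in the repository.
PACKAGES = ["anesthetic", "wpandas", "lpandas"]


def apidoc_commands(packages, templates="docs/templates/", output="docs/source/"):
    """Build one sphinx-apidoc invocation per package (hypothetical helper)."""
    return [
        ["sphinx-apidoc", "-fM", "-t", templates, "-o", output, f"{pkg}/"]
        for pkg in packages
    ]


for cmd in apidoc_commands(PACKAGES):
    # shlex.join renders the argument list as a copy-pasteable shell command.
    print(shlex.join(cmd))
```

Passing each list to subprocess.run(cmd, check=True) would execute it. Since all runs write into the same output directory, the generated modules.rst would still need to list all three packages, as that commit message notes.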

This makes me question whether this is really what we want... Maybe this is all overkill. If we do want the separation, then the proper way might be to create a new repository for wpandas, completely independent from anesthetic. Thoughts?

AdamOrmondroyd · 2023-04-21T09:17:18Z

This makes me question whether this is really what we want... Maybe this is all overkill. If we do want the separation, then the proper way might be to create a new repository for wpandas, completely independent from anesthetic. Thoughts?

I think a separate wpandas repo really is overkill.

AdamOrmondroyd · 2023-04-24T11:19:11Z

@williamjameshandley @lukashergt are we still keen for this reorganisation? #282 depends on this, and I'm wary of falling behind pandas.

williamjameshandley · 2023-05-31T16:16:33Z

I think a separate wpandas repo really is overkill.

I agree.

@williamjameshandley @lukashergt are we still keen for this reorganisation?

Coming back to this I'm now also on the fence. @lukashergt do you want to cast a vote?

lukashergt · 2023-05-31T18:55:34Z

I think a separate wpandas repo really is overkill.

I agree.

@williamjameshandley @lukashergt are we still keen for this reorganisation?

Coming back to this I'm now also on the fence. @lukashergt do you want to cast a vote?

In many ways weighted, labelled data frames would deserve their own repository (or deserve to be pandas' own thing), separate from anesthetic, but I agree that that is overkill for us. However, a wpandas subpackage as designed in this PR is on some levels (notably automated documentation) more complicated than a separate repository, and certainly more complicated than a wpandas module.

I am quite happy with the current weighted_pandas.py module, which has already come a long way from anesthetic 1.0.0, so my vote would go to leaving things the way they are for now. We can reflect on this again for a future anesthetic 3.0.0.

williamjameshandley · 2023-06-01T07:48:33Z

I am quite happy with the current weighted_pandas.py module, which has already come a long way from anesthetic 1.0.0, so my vote would go to leaving things the way they are for now. We can reflect on this again for a future anesthetic 3.0.0.

OK, let's leave it for now, and press ahead with the anesthetic 2.0.0 release.

AdamOrmondroyd · 2023-06-04T12:30:11Z

Just a couple of thoughts:

To butcher a classic quote: "premature modularity is the root of all evil". I agree that wpandas and lpandas, if they exist, would require their own repos. However, this would require three PRs every time an update to e.g. pandas or matplotlib breaks something, and it's already a reasonable effort to keep up. I can also see it being confusing for new users to identify the cause of a bug if it lies in l/wpandas.

Modularity allows code to be reused. I think it is very unlikely that someone outside our circle would think of or want to use l/wpandas, so until we have another project that would benefit from using them (perhaps the new C++ PolyChord could benefit from wpandas?), we should leave it as is.

williamjameshandley added 6 commits March 22, 2023 15:59

wpandas refactor

6980b13

Updated modules

23faf4a

Updated documentation

d4bf52c

Merge branch 'master' into wpandas_reorg

43bc825

Tests now pass

076cdd4

documentation update weighted_pandas -> wpandas

44f90d5

williamjameshandley added 2 commits April 9, 2023 11:31

Removed :show-inheritance: from wpandas to match master behaviour

a5bc9e7

Sphinx actually fixed now

f8c0ac5

missing brackets after super()

167b23e

williamjameshandley added this to the 2.0.0 milestone Apr 10, 2023

AdamOrmondroyd mentioned this pull request Apr 11, 2023

Changes for pandas 2.0.0 #282

Closed

15 tasks

williamjameshandley added 2 commits April 11, 2023 20:55

Merge branch 'master' into wpandas_reorg

0ea16a0

Moved anesthetic.samples to anesthetic.core

fbf4a7a

lukashergt reviewed Apr 11, 2023

AdamOrmondroyd and others added 9 commits April 11, 2023 22:34

Merge branch 'master' into wpandas_reorg

6fe2183

version bump

ca022cb

Refactored wpandas

e9fba58

Merge branch 'wpandas_reorg' of github.com:handley-lab/anesthetic into wpandas_reorg

354f665

remove show-inheritance from anesthetic.core to see if this fixes docs

8f37ed7

attempt to correct wpandas.utils docs

5611c58

put utils docs in right place

1f8c516

just get rid of wpandas utils

8e7548a

separate wpandas.core module

53f486d

reinstate :show-inheritance:

fe08bf1

htjb mentioned this pull request Apr 20, 2023

Extending channel_capacity() function #284

Closed

6 tasks

htjb mentioned this pull request Apr 21, 2023

Effective samples #285

Merged

6 tasks

AdamOrmondroyd added 2 commits April 21, 2023 11:53

run sphinx-apidoc on anesthetic, wpandas and lpandas separately, then make sure all three are in modules.rst

a2c6d03

add wpandas.core.groupby.rst and wpandas.core.util.rst

536a56f

williamjameshandley closed this Jun 1, 2023

williamjameshandley deleted the wpandas_reorg branch June 14, 2023 20:44
