-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WeightedGroupBy #272
WeightedGroupBy #272
Conversation
Codecov Report
@@ Coverage Diff @@
## master #272 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 30 30
Lines 2555 2643 +88
=========================================
+ Hits 2555 2643 +88
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
I'm aware coverage won't be complete as the test only considers the mean. Currently trying to think of a more efficient way of writing the tests |
The latest comment sounded in favour of the fix, so I'd still have hope... What does the |
|
@lukashergt The failing tests are to do with building documentation, which I'll be honest I could do with some pointers of what you might like to see here. Edit: sticking underscores in front of everything hasn't fixed everything, and I probably should actually take care with the documentation for groupby |
…tion" This reverts commit 244cd1f.
…erencing pandas is a pain, some of its classes lack docs
* `GroupBy` does not have its own docs. - Its initialisation signature looks like a core dump, hence implementing our own initialisation function. - Trying to cross-reference as ``:class:`pandas.core.groupby.GroupBy` `` will fail, hence dropping the link attempt. Same goes for `SeriesGroupBy` and `DataFrameGroupBy`. * Dropping `kurtosis` from `WeightedGroupBy`, since it is not implemented in `pandas.core.croupby.GroupBy`. Leave that to tacke once/if/when we really need it. * Add docstring adjustments for `WeightedDataFrameGroupBy` and `WeightedSeriesGroupBy` to the end of `weighted_pandas` in the same way as previously done for `WeightedDataFrame` and `WeightedSeries`.
…ince they have essentially no documentation anyhow
I guess I should probably test all the plot types... |
def kurt(self, *args, **kwargs): # noqa: D102 | ||
return self._add_weights("kurt", *args, **kwargs) | ||
|
||
def kurtosis(self, *args, **kwargs): # noqa: D102 | ||
return self._add_weights("kurtosis", *args, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to define kurt
and kurtosis
here, but don't need to define skew
? Yet skew
still passes the tests...?
I first thought this had to do with pandas GroupBy
not overriding skew
as opposed to mean
, median
, std
and var
. However, the same would go for kurt
and kurtosis
, so now I wonder what is different/special about skew
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fyi I've found that skew
needs adding for pandas 2 (otherwise it fails the is weighted test), so could add it in here now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little bit of an odd one -- pandas 1.5 doesn't actually implement groupby kurt and kurtosis (despite the fact I believe it could). @AdamOrmondroyd does it need adding for pandas 2? My instinct is to leave it in, as it doesn't do any harm.
Is there anything else which needs to be done? |
I was working my way through testing all the plots, but so far I haven't found any issues with the plots themselves |
Had a fruitless look into why it's failing on pip but not conda, isn't as simple as numpy 1.23 vs 1.24 |
I think it was a numerical/floating point thing where the covariance matrix turned out positive definite despite having a linear dependent parameter. Anyhow, more helped more (85fa6ae)... |
@lukashergt if you're happy with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, with the Gelman--Rubin statistic fixed thanks to the functioning groupby I am happy now. If you are happy too, feel free to squash and merge.
Thanks, @AdamOrmondroyd, @williamjameshandley
Description
Statistics of
WeightedDataFrame.groupby
such asmean()
ignore weights. Since my ambitious attempt to fix this in pandas itself is looking very unlikely to get through, I have attempted to fix it here. However, they have suggested adding aself._gb_cls
attribute toSeries
andDataFrame
, which.groupby()
returns. For now, I have copied the methods.Fixes #260
Checklist:
flake8 anesthetic tests
)pydocstyle --convention=numpy anesthetic
)python -m pytest
)mean().mean()
case working (test currently commented out). This seems like it would require significantly more work...