Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check the provided bins arguments and raise an Error if wrong #275

Merged
merged 1 commit into from
Jul 30, 2024

Conversation

jkittner
Copy link
Contributor

previously when specifying bins that did not include all (wind speed-) data, the data was simply not displayed and silently ignored.

it now raises an error and displays a helpful message.


Some example:

import pandas as pd
from windrose import WindroseAxes
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'wind_dir': [360, 360, 360, 360, 360, 180, 180, 180, 180, 180],
    'wind_speed': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5,],
})

We define custom bins

everything works as expected

ax = WindroseAxes.from_ax()
ax.bar(df['wind_dir'], df['wind_speed'], normed=True, bins=[0, 1, 2, 3])
ax.set_legend()
rgrids = [0, 10, 20, 25, 30, 33, 40, 50]
ax.set_rgrids(rgrids, rgrids)
The resulting plot

grafik

We define custom bins not starting at 0

We're silently losing 20% of our data (everything below the lowest bin)!

ax = WindroseAxes.from_ax()
ax.bar(df['wind_dir'], df['wind_speed'], normed=True, bins=[2, 3])
ax.set_legend()
rgrids = [0, 10, 20, 25, 30, 33, 40, 50]
ax.set_rgrids(rgrids, rgrids)
The resulting plot

grafik

We define custom bins and a calm_limim. The custom bins are below the calm_limit

We're getting an incorrect legend showing bins with data that was excluded with the calm_limit filter. The plot itself is correct.

ax = WindroseAxes.from_ax()
ax.bar(df['wind_dir'], df['wind_speed'], normed=True, bins=[0, 1, 2, 3], calm_limit=2)
ax.set_legend()
rgrids = [0, 10, 20, 25, 30, 33, 40, 50]
ax.set_rgrids(rgrids, rgrids)
The resulting plot

grafik

We will now get an error. For the first example:

Traceback (most recent call last):
  File "/home/kittnjdr/workspace/windrose/t.py", line 28, in <module>
    ax.bar(df['wind_dir'], df['wind_speed'], normed=True, bins=[0, 1, 2, 3], calm_limit=2)
  File "/home/kittnjdr/workspace/windrose/windrose/windrose.py", line 640, in bar
    bins, nbins, nsector, colors, angles, kwargs = self._init_plot(
                                                   ^^^^^^^^^^^^^^^^
  File "/home/kittnjdr/workspace/windrose/windrose/windrose.py", line 385, in _init_plot
    raise ValueError(
ValueError: the lowest value in bins must be >= 2 (=calm_limits)

For the 2nd example:

Traceback (most recent call last):
  File "/home/kittnjdr/workspace/windrose/t.py", line 20, in <module>
    ax.bar(df['wind_dir'], df['wind_speed'], normed=True, bins=[2, 3])
  File "/home/kittnjdr/workspace/windrose/windrose/windrose.py", line 640, in bar
    bins, nbins, nsector, colors, angles, kwargs = self._init_plot(
                                                   ^^^^^^^^^^^^^^^^
  File "/home/kittnjdr/workspace/windrose/windrose/windrose.py", line 377, in _init_plot
    raise ValueError(
ValueError: the first value provided in bins must be less than or equal to the minimum value of the wind speed data. Did you mean: bins=(0, 2, 3) ? If you want to exclude values below a certain threshold, try setting calm_limit=2.

I think raising an error is the correct way. We force the user to be explicit. We could of course silently change the bins value according to calm_limit or the minimum data. However, the user was already explicit by manually providing bins and/or calm_limit as kwargs. They might as well provide the correct values.

The way it is now gave me quite a headache debugging why things did not add up to 100%.

In my example for the subplots in usage.ipynb this was correct (by accident).

Happy to hear your thoughts on this!

...ah there was a ruff warning I fixed in pyproject.toml
...in the baseline plots only the legend changed from 0.0 to 0.2 (the calm_limit)

previously when specifying bins that did not include all data, the data was simply not displayed

it now raises an error and displays a helpful message
@jkittner
Copy link
Contributor Author

The font differences got me again...
I am unable to produce baselines locally. I had this in the past with testing ploty. I had to manually set font_family='DejaVu Sans'. I think we need to explicitly specify a font that is present on every machine. That's of course tricky when testing on windows and mac...

@ocefpaf
Copy link
Collaborator

ocefpaf commented Jul 30, 2024

The font differences got me again... I am unable to produce baselines locally. I had this in the past with testing ploty. I had to manually set font_family='DejaVu Sans'. I think we need to explicitly specify a font that is present on every machine. That's of course tricky when testing on windows and mac...

Or we can set a higher tolerance. Probably OK to avoid small fonts diffs.

@jkittner
Copy link
Contributor Author

Yeah, I though about that as well, but we still want to catch differences in legend entries or incorrect labels. It's hard to get the tolerance right...
I wish there was some sort of semantic testing via vector output of some sort. However than it's hard to actually show what differs to the user.

@ocefpaf
Copy link
Collaborator

ocefpaf commented Jul 30, 2024

I think raising an error is the correct way. We force the user to be explicit. We could of course silently change the bins value according to calm_limit or the minimum data. However, the user was already explicit by manually providing bins and/or calm_limit as kwargs. They might as well provide the correct values.

Raising and error is better IMO.

I wish there was some sort of semantic testing via vector output of some sort. However than it's hard to actually show what differs to the user.

Indeed. I'd love to remove the fonts as a variable in most of my image comparisons. Anyways, it is still amazing that we can do that kind of tests anyway.

Thanks for the PR. IMO it is good to go. I'll mint a new bugfix release ASAP.

@ocefpaf ocefpaf merged commit fa46b03 into python-windrose:main Jul 30, 2024
15 checks passed
@jkittner jkittner deleted the check-custom-bins branch July 30, 2024 09:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants