Smaller mark width for overlapping data with so.Hist(common_bins=False) #3769

Open
maurosilber opened this issue Oct 19, 2024 · 1 comment

When using so.Hist(common_bins=False), if the bins of different groups overlap, the width calculated for each mark is smaller than it should be.

Here's a minimal working example, where I have a dataset A, and its x-shifted version B = A + shift. In each row, I'm plotting a different shift, and when they start overlapping, the bar width is smaller than the bin width.

[Image: one histogram row per shift value]

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn.objects as so


def plot(ax, shift: float):
    data = np.random.default_rng(0).normal(size=50)
    df = pd.DataFrame({"A": data, "B": data + shift}).melt()
    return (
        so.Plot(df, x="value", color="variable")
        .add(so.Bars(), so.Hist(common_bins=False))
        .on(ax)
        .plot()
    )


shifts = [4.5, 4, 3.9, 3.8]
fig, axes = plt.subplots(len(shifts), sharex=True, gridspec_kw={"hspace": 0})
for ax, shift in zip(axes, shifts):
    plot(ax, shift)
    ax.set(ylabel=f"{shift = }")

I traced it to this width calculation:

spacing = scales[orient]._spacing(view_df.loc[view_idx, orient])

which ends up running the following line on the values of all groups pooled together:
return np.min(np.diff(np.sort(x)))

If the bin edges are [0, 1, 2] for one group and [0.5, 1.5, 2.5] for the other, the spacing is computed from the combined values [0, 0.5, 1, 1.5, ...], which gives a width of 0.5 instead of 1.
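
To make the arithmetic concrete, here is a minimal NumPy-only sketch of that example. It does not call seaborn: `min_spacing` is a hypothetical helper that just reuses the expression quoted above, and the per-group variant is only one possible direction, not necessarily how a fix should be implemented.

```python
import numpy as np


def min_spacing(x):
    # Hypothetical helper: same expression as the line quoted above.
    return np.min(np.diff(np.sort(x)))


# Bin positions of the two groups from the example.
edges_a = np.array([0.0, 1.0, 2.0])
edges_b = np.array([0.5, 1.5, 2.5])

# Pooling both groups interleaves their positions, halving the spacing.
pooled = float(min_spacing(np.concatenate([edges_a, edges_b])))

# Computing the spacing separately per group recovers the true bin width.
per_group = [float(min_spacing(e)) for e in (edges_a, edges_b)]

print(pooled)     # 0.5
print(per_group)  # [1.0, 1.0]
```

The pooled result matches the narrow bars in the plot, while the per-group result matches the actual bin width.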

Maybe this is not a bug but something by design when there is overlap between marks?

In case it is a bug, I could contribute a fix, but would probably need some direction as to where to fix it.

Thanks!
