Enhance performance of _paint in glyph_renderer #14427

Open
muendlein opened this issue Mar 17, 2025 · 4 comments · May be fixed by #14428

Comments

@muendlein
Contributor

Problem description

Performance when rendering large scatter datasets with WebGL is mainly limited by CPU tasks.

Feature description

Reduce the overhead caused by index handling in _paint.

Potential alternatives

--

Additional information

No response

@muendlein muendlein linked a pull request Mar 17, 2025 that will close this issue
2 tasks
@bryevdv
Member

bryevdv commented Mar 17, 2025

Hi @muendlein, I definitely appreciate the focus on performance improvements, but these issues and PRs are a little light on details. Can you provide a little more by way of explaining the problem, the solution, how much improvement there is, and in what circumstances? (Ideally with some minimal profiling documentation, e.g. even just screenshots as in #14262.)

@ianthomas23
Member

The performance of WebGL in Bokeh is indeed limited by a large number of JavaScript calculations per rendered frame, such as index calculations, retransforming every data point to screen coordinates (#11369 item 8), and so on. This isn't surprising: the architecture of Bokeh was designed to be good for canvas rendering, and the WebGL glyphs have to work around this. Piecemeal improvement isn't really possible; taking full advantage of the benefits of WebGL would require significant architectural change.

If someone had the time and motivation for such an architectural change, I still would not encourage it. That level of energy would be better spent on WebGPU, WebWorkers and WebAssembly, for example.

@muendlein
Contributor Author

muendlein commented Mar 18, 2025

@bryevdv You are fully right about the missing details. Let me elaborate a bit more.

The problem:

  • bitset currently uses generators to retrieve indices -> inefficient
  • _paint calls the spread operator (e.g. [...this.all_indices]) -> inefficient

The solution:

  • Avoid calling generators -> plain function
  • Avoid the spread operator by utilizing the return array directly.
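The two points above can be illustrated with a minimal sketch. This is not the actual BokehJS code: `SimpleBitSet`, `ones_lazy` and `ones` are hypothetical names chosen for illustration, and the real bitset in Bokeh may be structured differently. The sketch only contrasts a generator consumed through the spread operator with a plain function that returns a preallocated array the caller can use directly.

```typescript
// Hypothetical minimal bitset for illustration only (not Bokeh's implementation).
class SimpleBitSet {
  readonly size: number
  private readonly words: Uint32Array

  constructor(size: number) {
    this.size = size
    this.words = new Uint32Array(Math.ceil(size / 32))
  }

  set(i: number): void {
    this.words[i >>> 5] |= 1 << (i & 31)
  }

  get(i: number): boolean {
    return ((this.words[i >>> 5] >>> (i & 31)) & 1) === 1
  }

  // Generator variant: every index is handed out through the iterator protocol.
  *ones_lazy(): Generator<number> {
    for (let i = 0; i < this.size; i++) {
      if (this.get(i)) {
        yield i
      }
    }
  }

  // Plain-function variant: count the set bits, then fill a preallocated array.
  ones(): Uint32Array {
    let count = 0
    for (let i = 0; i < this.size; i++) {
      if (this.get(i)) {
        count++
      }
    }
    const out = new Uint32Array(count)
    let k = 0
    for (let i = 0; i < this.size; i++) {
      if (this.get(i)) {
        out[k++] = i
      }
    }
    return out
  }
}

const bits = new SimpleBitSet(100)
for (const i of [3, 31, 32, 64, 99]) {
  bits.set(i)
}

// Generator + spread: iterates lazily and copies into a fresh array.
const via_spread: number[] = [...bits.ones_lazy()]

// Plain function: the returned array is used as-is, with no extra copy.
const direct: Uint32Array = bits.ones()
```

Both variants produce the same indices; the difference is only in how they are produced. How much the plain-function form gains in practice depends on the JavaScript engine and the data size, so it should be measured rather than assumed.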

Performance testing:

Example script that plots 1 million scatter points (2 different markers each 500k):

import numpy as np
from bokeh.io import curdoc
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

n_splits = 50
n_split_samples = 10_000  # 50 * 10_000 = 500k rows in total

cds_scatter = ColumnDataSource(data={"x":[], "x2": [], "y":[]})

# Fill the data source chunk by chunk, one stream() call per split
for i in range(n_splits):
    x_data = np.arange(n_split_samples)
    # np.random.randn() returns one scalar, so each chunk gets a single random slope
    y_data = np.sin(x_data/100) + x_data/500*np.random.randn()

    cds_scatter.stream({"x": x_data, "x2": -x_data, "y": y_data})

p1 = figure(title="Scatter", output_backend="webgl")

# Two marker glyphs on the same source: 2 x 500k = 1 million points
p1.scatter(x="x", y="y", source=cds_scatter, size=1)
p1.scatter(x="x2", y="y", source=cds_scatter, size=3, marker="square")

curdoc().add_root(p1)

Frame times when zooming out or panning:

Chrome: Speed up ~2.65x
Baseline: ~265 ms
Optimized: ~100 ms


Firefox: Speed up ~2.69x
Baseline: ~713 ms
Optimized: ~265 ms


Please note that my other PRs address further bottlenecks (_search_indices and set_data) that become obvious after this optimization.

The overall speed up together with PRs #14419 and #14421 is ~7x.
[Images: results for Chrome and Firefox]

@ianthomas23 To be clear, I'm only addressing some low-hanging fruit currently present with markers.

@bryevdv
Member

bryevdv commented Mar 18, 2025

@muendlein Very nice, thank you very much for the added context.
