Very hard to pick individual pixels on seaborn heatmaps (pcolormesh) #99

Open
steel3d opened this issue Sep 16, 2020 · 14 comments

@steel3d

steel3d commented Sep 16, 2020

image

Here's my test code:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from mpldatacursor import datacursor
fig, ax = plt.subplots(figsize=(15, 10))
arr = np.random.rand(200, 200)
sns.heatmap(np.flip(arr.transpose(), axis=0), cmap="cubehelix", ax=ax)
datacursor()
plt.show()

It seems to work OK at the default zoom, but once zoomed in, the graph almost never responds to mouse clicks, so it's really hard to pick the point you want. The arrow is also inaccurate: in the screenshot it points to a white pixel, but the z value shown is 0.5. A pixel picked at the default zoom also shows the wrong z value. You can see this when zoomed out, but it becomes especially obvious once you zoom in.

This is on a 16 core AMD Threadripper machine with 64GB RAM.

@joferkington
Owner

Are you using matplotlib inside a notebook, by chance?

The matplotlib notebook backend has quite a few performance drawbacks and can have odd rendering in some browsers. There's not really much that can be done about that from the mpldatacursor side, unfortunately.

@joferkington
Owner

If you're not in a notebook, do you know which matplotlib backend you're using?

What's happening (notebook or not) is that things are being refreshed too frequently. Hover mode in matplotlib normally checks if it's drawn recently and only updates if it hasn't, but that can break in some backends. I can't directly reproduce the issue locally, but that certainly doesn't mean it's not there, just that it's not there with the matplotlib backends I can test.

@steel3d
Author

steel3d commented Sep 17, 2020

My script is exactly as written, run from the command line in Python 3.6, so I assume I'm using the default backend. The performance is the same in a notebook with %matplotlib notebook.

You can't repro the accuracy issue, either?

I can't test in python 3.8 because I get an exception:

Traceback (most recent call last):
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\tkinter\__init__.py", line 1883, in __call__
    return self.func(*args)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\backends\_backend_tk.py", line 293, in button_press_event
    FigureCanvasBase.button_press_event(
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\backend_bases.py", line 1854, in button_press_event
    self.callbacks.process(s, mouseevent)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\cbook\__init__.py", line 229, in process
    self.exception_handler(exc)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\cbook\__init__.py", line 81, in _exception_printer
    raise exc
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\matplotlib\cbook\__init__.py", line 224, in process
    func(*args, **kwargs)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 718, in _select
    self(new_event)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 235, in __call__
    self._show_annotation_box(event)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 275, in _show_annotation_box
    self.update(event, annotation)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 575, in update
    annotation.set_text(self.formatter(**info))
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 348, in _formatter
    x = self._format_coord(x, ax.xaxis)
  File "C:\Users\Steve\AppData\Local\Programs\Python\Python38\lib\site-packages\mpldatacursor\datacursor.py", line 413, in _format_coord
    return formatter.pprint_val(x)
AttributeError: 'ScalarFormatter' object has no attribute 'pprint_val'

@joferkington
Owner

joferkington commented Sep 17, 2020

There is no "default backend", for what it's worth. If nothing is specified, matplotlib chooses based on the OS and what packages are available.
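
A quick way to see which backend matplotlib actually resolved on a given machine is to ask it directly. This is a minimal sketch; the helper name `current_backend` is only illustrative, and the printed value depends on the OS and the GUI toolkits installed.

```python
import matplotlib


def current_backend():
    """Return the name of the backend matplotlib resolved for this session."""
    return matplotlib.get_backend()


# To force a specific backend instead, call matplotlib.use() early,
# ideally before importing pyplot (TkAgg requires tkinter):
# matplotlib.use("TkAgg")

print(current_backend())  # the value depends on the environment
```

Including this output in a bug report removes the guesswork about which GUI toolkit is handling the mouse events.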

From the stacktraces, it looks like you're using TkAgg. That's what I'm using locally, as well, but I can't reproduce the issue. The same code works perfectly with no lag, zoomed in or not. However, I don't have any way to access a Windows machine, so there could be some OS-specific things at play. I'll try to see if I can find a friend with access to Windows to try to reproduce.

FWIW, the error you posted is due to changes in the most recent version of matplotlib. It has been fixed in master.
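
Until a fixed release is installed, one possible workaround is to pass a custom `formatter` to `datacursor`, so the default formatter in the traceback (the one that calls `pprint_val`) is never used. The sketch below is an assumption-laden example, not the fix that went into master: `plain_formatter` is a hypothetical name, and it assumes the picked artist reports scalar `x`, `y` and (optionally) `z` values.

```python
def plain_formatter(**info):
    """Build annotation text without going through matplotlib's axis formatters.

    Assumes info may contain scalar "x", "y" and "z" entries; other keys
    supplied by the cursor are ignored.
    """
    lines = []
    for key in ("x", "y", "z"):
        value = info.get(key)
        if value is not None:
            lines.append(f"{key}: {value:.4g}")
    return "\n".join(lines)


# Hypothetical usage, assuming mpldatacursor is installed:
# from mpldatacursor import datacursor
# datacursor(formatter=plain_formatter)

print(plain_formatter(x=12.3456, y=7.0, z=0.51234))
```

Because the function formats the numbers itself, it does not depend on which methods a particular matplotlib version exposes on `ScalarFormatter`.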

@steel3d
Author

steel3d commented Sep 18, 2020

I can't test on Mac because of the same bug. When will the fix be available through pip install?

@steel3d
Author

steel3d commented Oct 4, 2020

I did pip uninstall and used setup.py install from master. The behavior is still the same as reported, on Windows with Python 3.8 and on Mac with Python 3.7. It's almost impossible to get a click to register once zoomed in, and the z values don't always reflect what the arrow is pointing at. Sometimes it gets into a state where it doesn't respond to any left clicks. A right click will make the box go away, but no amount of left clicks will bring the box up again at any zoom level.

@steel3d
Author

steel3d commented Oct 19, 2020

@joferkington Have you been able to repro the responsiveness and the accuracy problems?

@joferkington
Owner

I still haven't been able to reproduce the issues you're seeing, unfortunately. I've gotten access to a Windows laptop and tested things there. Everything seems extremely responsive for me. However, for various reasons (not my machine and didn't want to install anything invasive), I haven't tested on Windows or MacOS with the latest version of matplotlib, which could explain the difference.

@steel3d
Author

steel3d commented Oct 19, 2020

Windows is not necessary, as my repro is exactly the same on Mac. You're sure you're running my script as written from the command line and zooming to about that zoom level? And you get accurate values? I'm on matplotlib 3.3.2. Can any other packages affect it? Here are some package versions:

Requirement already satisfied: seaborn in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (0.11.0)
Requirement already satisfied: matplotlib>=2.2 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (3.3.2)
Requirement already satisfied: numpy>=1.15 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (1.19.1)
Requirement already satisfied: scipy>=1.0 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (1.5.2)
Requirement already satisfied: pandas>=0.23 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from seaborn) (1.1.2)
Requirement already satisfied: pillow>=6.2.0 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (7.2.0)
Requirement already satisfied: python-dateutil>=2.1 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (2.8.1)
Requirement already satisfied: certifi>=2020.06.20 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (2020.6.20)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (1.2.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (2.4.7)
Requirement already satisfied: cycler>=0.10 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from matplotlib>=2.2->seaborn) (0.10.0)
Requirement already satisfied: pytz>=2017.2 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from pandas>=0.23->seaborn) (2020.1)
Requirement already satisfied: six>=1.5 in c:\users\steve\appdata\local\programs\python\python38\lib\site-packages (from python-dateutil>=2.1->matplotlib>=2.2->seaborn) (1.15.0)

@joferkington
Owner

It's perfectly responsive for me with exactly that code, yes. Especially when zoomed in.

However, note that sns.heatmap plots things with pcolormesh, which means that matplotlib only considers clicks on the edges of cells as valid. That's an underlying matplotlib restriction.

As a result, if you click the center of a cell, it won't register.

If you plot with imshow (which is much better suited to your use case) instead of pcolormesh (which isn't meant for large regular arrays that can be displayed with imshow), clicking the centers of the cells would work fine.

@joferkington
Owner

To get a sense of what I'm talking about, try using imshow. Notice that it will trigger anywhere within the cell, not just at the edges:

import numpy as np
import matplotlib.pyplot as plt
from mpldatacursor import datacursor

fig, ax = plt.subplots(figsize=(15, 10))
arr = np.random.rand(200, 200)
ax.imshow(arr, cmap='cubehelix', interpolation='none')
datacursor()
plt.show()
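
The example above plots `arr` directly, while the original report displayed `np.flip(arr.transpose(), axis=0)`. A sketch of one way to keep that orientation with imshow: plotting `arr.T` with `origin="lower"` puts row 0 at the bottom, which should look the same as flipping the rows and drawing from the top. The function name `draw_heatmap` and the added colorbar are illustrative choices, not part of either script.

```python
import numpy as np
import matplotlib.pyplot as plt


def draw_heatmap(arr, cmap="cubehelix"):
    """Draw arr.T with row 0 at the bottom, one image pixel per array cell."""
    fig, ax = plt.subplots(figsize=(15, 10))
    # origin="lower" replaces the explicit np.flip(..., axis=0)
    im = ax.imshow(arr.T, cmap=cmap, origin="lower", interpolation="none")
    fig.colorbar(im, ax=ax)  # seaborn's heatmap also draws a colorbar by default
    return im


# Typical use, assuming mpldatacursor is installed:
# from mpldatacursor import datacursor
# draw_heatmap(np.random.rand(200, 200))
# datacursor()
# plt.show()
```

With this layout the row and column reported for a picked pixel index `arr.T` directly, with no extra flip to undo.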

@steel3d
Author

steel3d commented Oct 19, 2020

OK, well, that's the issue. Your sample works well. sns "works" if I select very close to a pixel edge, but you need secret knowledge to know which edge pertains to which pixel (even if the arrow points to the right side of an edge, it gives you the left pixel's value). Seems dumb that matplotlib can't pick inside a mesh polygon.

Anyways, problem solved. Thanks.

@joferkington
Owner

joferkington commented Oct 19, 2020

@steel3d - For what it's worth, I've considered hacking around the limitations of pcolormesh (or, more accurately, the QuadMesh artist) interactions in the past so that logical cells (i.e. what you see) can be selected instead of edges.

It's not impossible, but it's better addressed with changes to matplotlib, rather than changes to mpldatacursor.

The last time I dug into it, it was easy to do for simple rectangular cases, but harder to do for the generic cases that QuadMesh supports. I don't know that I'll have time in the near future to look back into the issue, but I could give you or someone else an overview of what the changes might look like, if you wanted to tackle it or open a feature request ticket for matplotlib.
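
For the simple rectangular case, a rough sketch of the idea is to skip artist picking entirely and compute the cell from the click's data coordinates. This is not mpldatacursor or matplotlib code. It assumes the regular layout seaborn's heatmap appears to use (cell `(row, col)` covers `col <= x < col + 1` and `row <= y < row + 1`), and the names `cell_at` and `attach_cell_picker` are hypothetical.

```python
import numpy as np


def cell_at(data, x, y):
    """Return (row, col, value) for the cell containing data point (x, y).

    Returns None when the point is missing or falls outside the grid.
    """
    if x is None or y is None:
        return None
    row, col = int(np.floor(y)), int(np.floor(x))
    n_rows, n_cols = data.shape
    if not (0 <= row < n_rows and 0 <= col < n_cols):
        return None
    return row, col, data[row, col]


def attach_cell_picker(ax, data):
    """Left click annotates the clicked cell; right click hides the box."""
    note = ax.annotate("", xy=(0, 0), xytext=(15, 15),
                       textcoords="offset points",
                       bbox=dict(boxstyle="round", fc="w"),
                       arrowprops=dict(arrowstyle="->"), visible=False)

    def on_click(event):
        if event.inaxes is not ax:
            return
        hit = cell_at(data, event.xdata, event.ydata)
        if event.button == 1 and hit is not None:
            row, col, value = hit
            note.xy = (col + 0.5, row + 0.5)  # point at the cell centre
            note.set_text(f"row={row}, col={col}\nz={value:.3f}")
            note.set_visible(True)
        elif event.button == 3:
            note.set_visible(False)
        ax.figure.canvas.draw_idle()

    # Returns the connection id so the handler can be disconnected later
    return ax.figure.canvas.mpl_connect("button_press_event", on_click)
```

Here `data` would be the same array that was passed to `sns.heatmap`, and `ax` the axes it was drawn on. The general QuadMesh case (irregular or curvilinear cells) would need a real point-in-quadrilateral test, which is the harder part described above.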

@steel3d
Author

steel3d commented Oct 19, 2020

Sorry, I don't even have time to work on my own stuff, not to mention stuff like this :) Hopefully at least this will save others some time, now that it's a known issue.

steel3d changed the title from "Performance and consistency is extremely poor on large heatmaps" to "Very hard to pick individual pixels on seaborn heatmaps (pcolormesh)" on Oct 19, 2020