Crashes reading a large file #71
Comments
Hello Jon, thanks for trying the extension and for the feedback!
Axis selection
Well, h5web is a "dumb" viewer: it only displays visualizations corresponding to the content of the file. It is not meant to be a general-purpose plotting tool.
Reasons for the crash when reading a large dataset
This is due to a limitation in the Line visualization: we have a feature (auto-scale off) where the axis limits are set to the limits of the full dataset. As a consequence, when using the Line, h5web fetches the full dataset. In this case, I believe this is around 256 GB (😱), making the whole Jupyter server crash. I still need to investigate the exact reason. Note that the Heatmap does not suffer from this limitation: it only fetches the displayed slice. This is why the first display of the eiger dataset works.
What is next, then?
It would indeed make sense to fetch only the slice even for a Line visualization. The auto-scale feature imposes a significant limitation for large datasets and we need to work around it somehow. We have an issue in h5web where we track our ideas and improvements for fetching large datasets: silx-kit/h5web#616. The discussion about auto-scale will surely continue there, and any implementation fixing the crash will be mentioned there. In the meantime, use the Heatmap? 😅 |
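As a rough illustration of that direction (not the actual h5web implementation), here is a minimal h5py sketch showing how global axis limits could be computed by streaming over slices instead of materializing the full dataset; the file and dataset paths are taken from the bug report below:

```python
import h5py
import numpy as np

# Illustrative sketch: compute global min/max for auto-scale
# without ever holding the full (~256 GB) dataset in memory.
with h5py.File("CeO2_38keV_CeO2_rotation.h5", "r") as f:
    dset = f["/1.1/measurement/eiger"]  # large 3D image stack

    vmin, vmax = np.inf, -np.inf
    for i in range(dset.shape[0]):
        frame = dset[i]              # reads a single slice from disk
        vmin = min(vmin, frame.min())
        vmax = max(vmax, frame.max())

print("axis limits:", vmin, vmax)
```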
@jonwright thanks for the positive feedback. @loichuder thanks for the explanations. It seems like we are missing a tool for flexible viewing of NeXus files, i.e. selecting what to display against what. Am I right to say that users have to build their own tool with a mixture of h5py and matplotlib for now? Does Braggy address this? |
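For reference, a minimal sketch of the kind of ad-hoc h5py + matplotlib viewing mentioned above; the file and x-axis paths come from the bug report below, while the y-axis path is a hypothetical placeholder:

```python
import h5py
import matplotlib.pyplot as plt

FILENAME = "CeO2_38keV_CeO2_rotation.h5"        # file from the report below
X_PATH = "/1.1/measurement/fpico6"              # dataset used as x axis
Y_PATH = "/1.1/measurement/some_other_signal"   # hypothetical: dataset to plot on y

with h5py.File(FILENAME, "r") as f:
    x = f[X_PATH][...]   # read the two 1D counters into memory
    y = f[Y_PATH][...]

plt.plot(x, y)
plt.xlabel(X_PATH)
plt.ylabel(Y_PATH)
plt.show()
```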
This is outside of the scope of Braggy, for sure. It's always possible to make a new GUI, but note that one solution to this problem is to generate a NeXus-compliant HDF5 file with external links to the relevant datasets, and then open this file in H5Web. It is obviously not as practical as a GUI, but we could easily provide Python utilities to make generating this sort of file a breeze (perhaps such utilities already exist).
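A rough sketch, using plain h5py, of what generating such a file could look like (the output file name and the axis dataset path are illustrative assumptions, not an existing utility):

```python
import h5py

# Build a small NeXus-compliant file whose datasets are external links
# into the original (large) scan file. Nothing is copied: the big data
# stays in SOURCE, and H5Web can show a ready-made NXdata plot.
SOURCE = "CeO2_38keV_CeO2_rotation.h5"  # original file from the report below

with h5py.File("custom_view.h5", "w") as f:  # hypothetical output name
    f.attrs["default"] = "entry"

    entry = f.create_group("entry")
    entry.attrs["NX_class"] = "NXentry"
    entry.attrs["default"] = "plot"

    data = entry.create_group("plot")
    data.attrs["NX_class"] = "NXdata"
    data.attrs["signal"] = "y"
    data.attrs["axes"] = "x"

    # External links to the datasets of interest.
    data["y"] = h5py.ExternalLink(SOURCE, "/1.1/measurement/fpico6")
    data["x"] = h5py.ExternalLink(SOURCE, "/1.1/measurement/some_axis")  # hypothetical path
```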
There are already some helpers to save … Otherwise, since this runs in a notebook, using … BTW, in …
Following up on the crash issue, we have something in the works to solve it: silx-kit/h5web#616 (comment). I will close this once it has shipped in a jupyterlab-h5web release.
silx-kit/h5web#616 (comment) was integrated in v0.1.0, which is now deployed on jupyter-slurm.
I am assuming this is the project behind the wonderful thing I found yesterday that lets me browse HDF5 files in JupyterLab? It looks fantastic. I wish I could figure out how to select the x and y axes for a plot; I always see data versus point number. The rest of the message is a bug report for how I seem to have broken something already (sorry!):
Describe the bug
JupyterLab crashes when reading a large dataset, perhaps due to an out-of-memory error?
To Reproduce
1 - Log into jupyter-slurm.esrf.fr with a single core and the Lab interface
2 - Navigate to and open: /data/id11/nanoscope/blc12407/id11/CeO2_38keV/CeO2_38keV_CeO2_rotation/CeO2_38keV_CeO2_rotation.h5
3 - Open dataset /1.1/measurement/eiger: it displays
4 - Open dataset /1.1/measurement/fpico6: it displays
5 - Go back to /1.1/measurement/eiger: JupyterLab stops running
6 - All the other tabs and kernels appear to exit when JupyterLab fails
Expected behaviour
In the worst case, a plugin would crash without taking down all of the other kernels. Ideally it would not crash.
Is there a way to use HDF5 slice operations (maybe combined with fast histograms) so that only the data about to be displayed on screen is held in memory (e.g. at most a 2D image)? Then libhdf5 should manage the memory cache in some sensible way.
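For what it's worth, this is roughly what a slice-only read looks like with h5py (a sketch assuming the eiger dataset from the steps above is a 3D image stack):

```python
import h5py

with h5py.File("CeO2_38keV_CeO2_rotation.h5", "r") as f:
    dset = f["/1.1/measurement/eiger"]
    print(dset.shape, dset.dtype)   # only metadata read so far

    frame = dset[0]   # reads just one 2D image from disk
    # frame.nbytes is the memory cost of what is shown on screen,
    # versus dset.size * dset.dtype.itemsize for the full stack.
    print(frame.nbytes, "bytes in memory for one frame")
```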
Context
Extension lists
This is based on a bit of guesswork as to what is actually running when I use jupyter-slurm: