Help rework code to prevent kernel crash? #166
-
When I run the code below, the kernel crashes after 15 minutes. The code gets temperature data for two adjacent MODIS tiles for the last 5 years and stitches them together, and as written it's hitting the Planetary Computer's memory and processing limits. If anyone could give me suggestions on how to rework it to prevent the kernel crash, I'd really appreciate it. Should I split the code into smaller functions? Break the data down into smaller chunks? Parallelize it? Can Dask help me in any way here? Here's the code:
And this is the error message I get in the Jupyter notebook:
-
Most likely you are hitting the memory limit of the notebook server node. The taskbar at the bottom should give you an indication of whether that's true.

If so, you'll want to find out where the memory usage is spiking. You can either do that manually, by stepping through your code line by line, or use a memory profiler. I've used https://pypi.org/project/memory-profiler/, and https://bloomberg.github.io/memray/ is supposed to be nice.

Once you've determined where the issue is, Dask might be able to help. I'm not sure whether the functions you're using work well with Dask arrays (i.e. without converting them to a single large NumPy array), but the memory profiler should help you figure that out.
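For example, a minimal profiling sketch (the function name and array sizes below are placeholders, since your actual code isn't shown here) could look like this:

```python
# Minimal sketch, not the actual code: `stitch_tiles` and the array sizes
# are placeholders. Decorating the suspected hot spot with @profile makes
# memory-profiler print per-line memory usage when the function runs.
import numpy as np
from memory_profiler import profile


@profile
def stitch_tiles(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    # Stand-in for the real stitching step: concatenating the two tiles
    # holds both inputs plus the combined output in memory at once.
    return np.concatenate([left, right], axis=-1)


if __name__ == "__main__":
    left = np.zeros((10, 1200, 1200), dtype="float32")   # placeholder tile stack
    right = np.zeros((10, 1200, 1200), dtype="float32")
    stitch_tiles(left, right)
```

Running it as `python -m memory_profiler your_script.py` (or `mprof run` followed by `mprof plot` for a memory-over-time chart) shows which lines spike; in a notebook, `%load_ext memory_profiler` plus `%memit` on a single call does something similar. memray works the same way from the command line with `memray run` and `memray flamegraph`.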
-
When I had a similar problem, I fixed it either by lowering the resolution so the reads use overviews (which shrinks the memory footprint) or by making the chunks smaller so each task uses less memory.
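As a rough sketch of what that can look like with odc-stac against the Planetary Computer STAC API (the collection, band name, bbox, and resolution below are placeholder assumptions, not taken from the original code):

```python
# Rough sketch: load both MODIS tiles lazily as one Dask-backed dataset,
# at a coarser-than-native resolution and in modest chunks, then reduce.
import odc.stac
import planetary_computer
import pystac_client

catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

items = catalog.search(
    collections=["modis-11A2-061"],       # placeholder: 8-day LST product
    datetime="2018-01-01/2022-12-31",     # last 5 years
    bbox=[-105.0, 35.0, -95.0, 45.0],     # placeholder area spanning both tiles
).item_collection()

ds = odc.stac.load(
    items,
    bands=["LST_Day_1km"],                # placeholder band name
    resolution=2000,                      # coarser than native ~1 km -> smaller arrays
    chunks={"x": 1024, "y": 1024},        # Dask chunks keep each task's memory small
)

# Nothing is in memory yet; .compute() triggers chunk-by-chunk work.
mean_lst = ds["LST_Day_1km"].mean(dim=["x", "y"]).compute()
```

Loading at a coarser-than-native resolution lets GDAL read from the COG overviews where they exist, and the explicit chunking means Dask only materializes one small piece of the stitched mosaic at a time instead of the whole 5-year stack.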