
Libraries independency #18

Open
foteni opened this issue Dec 8, 2024 · 7 comments

Comments

@foteni

foteni commented Dec 8, 2024

Hi, I am trying to run the repository on Google Colab but I am stuck on this error when I run !pip install . :
Using cached tensorboard_plugin_wit-1.8.1-py3-none-any.whl (781 kB)
Building wheels for collected packages: splinedist, numba, llvmlite
Building editable for splinedist (pyproject.toml) ... done
Created wheel for splinedist: filename=splinedist-0.1.2-0.editable-py3-none-any.whl size=6168 sha256=a8cdbdf305336d39cebfd14ee42a762d98d30a79317f2d1cfc0a2d69313ffc1f
Stored in directory: /tmp/pip-ephem-wheel-cache-mt9a3qbp/wheels/08/7a/e9/4a9106dc672109ff7c3930ddbb9ba70435fddeb520a059c510
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for numba (setup.py) ... error
ERROR: Failed building wheel for numba
Running setup.py clean for numba
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Building wheel for llvmlite (setup.py) ... error
ERROR: Failed building wheel for llvmlite
Running setup.py clean for llvmlite
Successfully built splinedist
Failed to build numba llvmlite
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (numba, llvmlite)

@sohmandal
Member

Hi @foteni, thank you for your interest in SplineDist!

After checking, it seems that there is an issue with the build of numba==0.51.2 on Google Colab. It looks like numba==0.55.0 could be a working alternative there. Could you please try that by changing the setup file here?
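A hypothetical sketch of that dependency pin change (the actual layout of the setup file in the SplineDist repository may differ):

```python
# Hypothetical excerpt of the install_requires list in setup.py;
# the surrounding file contents are assumptions, not the repo's actual code.
install_requires = [
    # "numba==0.51.2",  # original pin; its wheel fails to build on current Colab
    "numba==0.55.0",    # suggested alternative that builds there
    "llvmlite",         # resolved to a compatible version by the numba pin
]
```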

CC: @vuhlmann

@foteni
Author

foteni commented Dec 14, 2024

Hi again! I tried your suggestion and it worked! Thank you very much for your time.

I have one more question. I use the ipynb library to run the notebook files. I try to run this cell:

from ipynb.fs.full import training

to start the training, and at the start of the run I got something like this:

RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.

One of the solutions I found by googling is to upgrade NumPy, but I know that if I do so it will break other dependencies.
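For context, the hex codes in that warning identify NumPy's internal C-API generation; a small sketch with the values taken from the warning text itself:

```python
# Per the warning, the extension module was compiled against C-API 0x10
# (the NumPy 1.23 series), while the NumPy active in the Colab runtime
# exposes the older 0xe. This mismatch is exactly why the module raises
# RuntimeError at import time.
compiled_against = 0x10  # C-API of the NumPy the extension was built with
running = 0xe            # C-API of the NumPy currently installed

mismatch = compiled_against != running
print("C-API mismatch:", mismatch)
```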

Then I ran the notebook cell by cell to get more information about the error, and I found that the cell producing it is this one:

axis_norm = (0,1)   # normalize channels independently
# axis_norm = (0,1,2) # normalize channels jointly
if n_channel > 1:
    print("Normalizing image channels %s." % ('jointly' if axis_norm is None or 2 in axis_norm else 'independently'))
    sys.stdout.flush()

X = [normalize(x,1,99.8,axis=axis_norm) for x in tqdm(X)]
Y = [fill_label_holes(y) for y in tqdm(Y)]

I could still run the next cells, and I did, up until this cell:

model.train(X_trn, Y_trn, validation_data=(X_val,Y_val), augmenter=augmenter, epochs = 300)

in which the output was:

 /usr/local/lib/python3.10/dist-packages/keras/optimizers/optimizer_v2/adam.py:114: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
  super().__init__(name, **kwargs)

Also, at that moment the RAM crashes and I have to restart the runtime.

Then I used only 100 images and it ran! Here was the output:


 Epoch 1/2
100/100 [==============================] - ETA: 0s - loss: 1.6554 - prob_loss: 0.2379 - dist_loss: 7.0875 - prob_kld: 0.1637 - dist_relevant_mae: 7.0755 - dist_relevant_mse: 103.1758
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.
3/3 [==============================] - 1s 117ms/step
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.
100/100 [==============================] - 478s 5s/step - loss: 1.6554 - prob_loss: 0.2379 - dist_loss: 7.0875 - prob_kld: 0.1637 - dist_relevant_mae: 7.0755 - dist_relevant_mse: 103.1758 - val_loss: 2.1338 - val_prob_loss: 0.2454 - val_dist_loss: 9.4417 - val_prob_kld: 0.1776 - val_dist_relevant_mae: 9.4297 - val_dist_relevant_mse: 185.1811 - lr: 3.0000e-04
Epoch 2/2
3/3 [==============================] - 0s 115ms/step
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
RuntimeError: module was compiled against NumPy C-API version 0x10 (NumPy 1.23) but the running NumPy has C-API version 0xe. Check the section C-API incompatibility at the Troubleshooting ImportError section at https://numpy.org/devdocs/user/troubleshooting-importerror.html#c-api-incompatibility for indications on how to solve this problem.
100/100 [==============================] - 474s 5s/step - loss: 1.6047 - prob_loss: 0.2439 - dist_loss: 6.8038 - prob_kld: 0.1685 - dist_relevant_mae: 6.7918 - dist_relevant_mse: 96.5648 - val_loss: 1.8901 - val_prob_loss: 0.2140 - val_dist_loss: 8.3803 - val_prob_kld: 0.1462 - val_dist_relevant_mae: 8.3682 - val_dist_relevant_mse: 146.8352 - lr: 3.0000e-04

Loading network weights from 'weights_best.h5'.
<keras.callbacks.History at 0x7a8fb2ba2c80>

One last thing: the Google Colab GPU wasn't enabled when I ran the train function.
Sorry for the long message. Do you want me to create a new issue for these errors, or shall we continue here?

@sohmandal
Member

Hi @foteni, you are right, numpy versions other than the one recommended in the setup file might break some other dependencies! I assume that numba==0.55.0 is creating some conflicts, but I have not thoroughly tested this yet. However, I do not think this warning would lead to incorrect results in this context. Nonetheless, if you have the option to run the files somewhere you can install SplineDist with all the recommended versions of its dependencies, please go ahead with that. Also, please make sure that you are able to run all the cells, including the normalisation and morphological hole-filling cell you pointed out above, without any of them terminating with errors.

Regarding Google Colab crashing, I suspect it is happening because the contoursize_max value for your dataset in the training script is too large to fit in the available RAM. One solution would be to reduce its value so it fits in RAM, although that might slightly sacrifice overall result quality.
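A minimal sketch of that workaround; the variable names below are illustrative assumptions, not the actual SplineDist API:

```python
# Cap contoursize_max before building the model configuration, so the
# internal contour representation fits in the Colab instance's RAM.
# Both values here are made-up examples: in practice the first would be
# computed from the ground-truth labels, and the cap tuned to available memory.
computed_contoursize_max = 1024  # e.g. derived from the training masks
ram_friendly_cap = 512           # chosen to fit the runtime (assumption)

contoursize_max = min(computed_contoursize_max, ram_friendly_cap)
print("using contoursize_max =", contoursize_max)
```

Lowering the cap trades some contour fidelity for a smaller memory footprint, as noted above.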

Unfortunately, I am unsure why the Google Colab GPU was not enabled during your training session.
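One way to check whether the runtime actually has a GPU attached, assuming the TensorFlow build that ships with Colab:

```python
# Sketch: verify whether TensorFlow can see a GPU in the current runtime.
# In Colab, a GPU must first be selected under Runtime > Change runtime type.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs visible to TensorFlow:", gpus)  # empty list -> CPU-only runtime
```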

Hope this is helpful!

@foteni
Author

foteni commented Dec 18, 2024

Thank you very much for the prompt reply. Unfortunately I don't have any hardware, so I'm trying it with Google Colab; I'll leave it as it is for now and ignore this point. However, if you try it and find that it affects the results, and you find a solution that could help me, I would be grateful. Finally, if it's no trouble for you, I wanted to ask how much VRAM is required to run the model with data. Thank you!

@sohmandal
Member

Hi @foteni, you are welcome, and as I mentioned, I do not think that NumPy warning would result in incorrect results in this context! So please feel free to go ahead with the installation change suggested in this thread!

Could you please elaborate on what you meant by 'how much does the model take?'? Thanks!

@foteni
Author

foteni commented Dec 22, 2024

> Hi @foteni, you are welcome, and as I mentioned, I do not think that NumPy warning would result in incorrect results in this context! So please feel free to go ahead with the installation change suggested in this thread!
>
> Could you please elaborate on what you meant by 'how much does the model take?'? Thanks!

Finally, if it's no trouble for you, I wanted to ask how much VRAM is required to run the model with data. Thank you!

@sohmandal
Member

Hi @foteni, the VRAM requirement depends on multiple factors, such as the input dataset format and some SplineDist parameters (number of control points (M), contoursize_max, etc.) being used.

Not sure whether it would be helpful in your case, but there are some open-source pretrained SplineDist models available as noted in the Readme.
