Error when python test_env.py --env AntEnv #12
Comments
I solved it by switching to CUDA 11.7. Maybe this project doesn't support the latest version of CUDA. If someone manages to run it on the latest CUDA, I'd appreciate it a lot if you could share how. |
I'm hitting the same problem :( |
Similar error when running |
After switching to CUDA 11.7, the problem still exists. |
Same issue here: Windows 11, CUDA 12.2, Python 3.8, torch 2.2.0 |
The issue is that this code (Lines 1860 to 1861 in a4c0dd1) assumes the minimum compute capability is 35.
However, since CUDA 12 the minimum supported compute capability is 50: https://forums.developer.nvidia.com/t/nvcc-fatal-unsupported-gpu-architecture-compute-35/247815
I solved the issue by changing this line to: cuda_flags = ['-gencode=arch=compute_86,code=compute_86']
I'm using CUDA 12.2 and PyTorch 2.3.1 with an RTX 3060 on Ubuntu 20.04 LTS.
I found this link helpful as well: https://stackoverflow.com/questions/68496906/pytorch-installation-for-different-cuda-architectures |
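For GPUs other than the RTX 3060, the value 86 in that flag has to match the card in use. The following is a minimal sketch of how the flag could be derived rather than hard-coded; `gencode_flag` is a hypothetical helper, not part of dflex, and the result is only usable if the installed nvcc lists that architecture in `nvcc --list-gpu-arch`.

```python
def gencode_flag(major: int, minor: int) -> str:
    """Build an nvcc -gencode flag for one compute capability, e.g. (8, 6).

    Hypothetical helper for illustration; dflex does not ship this function.
    """
    arch = f"compute_{major}{minor}"
    # code=compute_XY embeds PTX for that virtual architecture,
    # mirroring the flag used in the comment above.
    return f"-gencode=arch={arch},code={arch}"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; pass the capability by hand.")
    else:
        if torch.cuda.is_available():
            # Returns a (major, minor) tuple for the current CUDA device.
            major, minor = torch.cuda.get_device_capability()
            print([gencode_flag(major, minor)])
        else:
            print("No CUDA device visible to PyTorch.")
```

The printed list could then be assigned to cuda_flags in place of the hard-coded value, assuming nvcc supports the reported architecture.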
I installed PyTorch using conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia on a system with CUDA 12.2, an NVIDIA 4090, and Ubuntu 20.04. Following @shizhec, I checked the architectures supported on my system:
(diff) bigeast@bigeast:~/DiffRL/examples$ nvcc --list-gpu-arch
compute_50
compute_52
compute_53
compute_60
compute_61
compute_62
compute_70
compute_72
compute_75
compute_80
compute_86
compute_87
compute_89
compute_90
So I changed cuda_flags to: cuda_flags = ['-gencode=arch=compute_86,code=compute_86']
But I still get the following error:
(diff) bigeast@bigeast:~/DiffRL/examples$ python test_env.py --env AntEnv
Rebuilding kernels
Detected CUDA files, patching ldflags
Emitting ninja build file /home/bigeast/DiffRL/dflex/dflex/kernels/build.ninja...
Building extension module kernels...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /home/bigeast/anaconda3/envs/diff/bin/nvcc -DTORCH_EXTENSION_NAME=kernels -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/bigeast/DiffRL/dflex/dflex -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/TH -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/THC -isystem /home/bigeast/anaconda3/envs/diff/include -isystem /home/bigeast/anaconda3/envs/diff/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -gencode=arch=compute_86,code=sm_86 -std=c++14 -c /home/bigeast/DiffRL/dflex/dflex/kernels/cuda.cu -o cuda.cuda.o
FAILED: cuda.cuda.o
/home/bigeast/anaconda3/envs/diff/bin/nvcc -DTORCH_EXTENSION_NAME=kernels -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/bigeast/DiffRL/dflex/dflex -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/TH -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/THC -isystem /home/bigeast/anaconda3/envs/diff/include -isystem /home/bigeast/anaconda3/envs/diff/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -gencode=arch=compute_86,code=sm_86 -std=c++14 -c /home/bigeast/DiffRL/dflex/dflex/kernels/cuda.cu -o cuda.cuda.o
In file included from /usr/include/cuda_runtime.h:83,
from <command-line>:
/usr/include/crt/host_config.h:138:2: error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
| ^~~~~
[2/3] c++ -MMD -MF main.o.d -DTORCH_EXTENSION_NAME=kernels -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/bigeast/DiffRL/dflex/dflex -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/TH -isystem /home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/include/THC -isystem /home/bigeast/anaconda3/envs/diff/include -isystem /home/bigeast/anaconda3/envs/diff/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -Z -O2 -DNDEBUG -c /home/bigeast/DiffRL/dflex/dflex/kernels/main.cpp -o main.o
/home/bigeast/DiffRL/dflex/dflex/kernels/main.cpp: In function ‘df::float3 box_sdf_grad_cpu_func(df::float3, df::float3)’:
/home/bigeast/DiffRL/dflex/dflex/kernels/main.cpp:1051:47: warning: control reaches end of non-void function [-Wreturn-type]
1051 | var_58 = df::select(var_56, var_53, var_57);
| ^
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1900, in _run_ninja_build
subprocess.run(
File "/home/bigeast/anaconda3/envs/diff/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "test_env.py", line 17, in <module>
import envs
File "/home/bigeast/DiffRL/envs/__init__.py", line 8, in <module>
from envs.dflex_env import DFlexEnv
File "/home/bigeast/DiffRL/envs/dflex_env.py", line 15, in <module>
import dflex as df
File "/home/bigeast/DiffRL/dflex/dflex/__init__.py", line 15, in <module>
kernel_init()
File "/home/bigeast/DiffRL/dflex/dflex/sim.py", line 67, in kernel_init
kernels = df.compile()
File "/home/bigeast/DiffRL/dflex/dflex/adjoint.py", line 1865, in compile
module = torch.utils.cpp_extension.load_inline('kernels',
File "/home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1433, in load_inline
return _jit_compile(
File "/home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1508, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1623, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/bigeast/anaconda3/envs/diff/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1916, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'kernels'
I then found that the error comes from the host C++ compiler check: unsupported GNU version! gcc versions later than 8 are not supported! This means the gcc installed on the system is newer than what the CUDA headers used by the build (/usr/include/crt/host_config.h) accept; they reject anything later than gcc 8.
1. Check the installed gcc version and install gcc 8:
gcc --version
sudo apt install gcc-8 g++-8
2. Switch
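The version check can also be scripted before triggering the kernel build. This is only a sketch: `parse_major` and `host_compiler_ok` are hypothetical helpers, the limit of 8 is taken from the error message in the log above and will differ for other CUDA toolkits, and it reads `gcc -dumpversion` instead of `gcc --version` because the output is easier to parse.

```python
import shutil
import subprocess


def parse_major(dumpversion_output: str) -> int:
    """Return the major version from `gcc -dumpversion` output such as '9.4.0' or '9'."""
    return int(dumpversion_output.strip().split(".")[0])


def host_compiler_ok(gcc_major: int, max_supported_major: int = 8) -> bool:
    """True if gcc is not newer than the limit enforced by the CUDA headers.

    The default of 8 is an assumption copied from the host_config.h error above.
    """
    return gcc_major <= max_supported_major


if __name__ == "__main__":
    if shutil.which("gcc") is None:
        print("gcc not found on PATH")
    else:
        result = subprocess.run(
            ["gcc", "-dumpversion"], capture_output=True, text=True, check=True
        )
        major = parse_major(result.stdout)
        status = "accepted" if host_compiler_ok(major) else "too new"
        print(f"gcc major version {major}: {status} for these CUDA headers")
```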
|
Excuse me, I ran into this problem when I tried the command
python test_env.py --env AntEnv
in the examples folder, as the guide describes. My PyTorch version is 1.11.0 and my CUDA version is 12.1.
Is there anything wrong with my system? I would appreciate it a lot if you could help me with this problem.