
Installation via pip fails for ROCm / AMD cards #646

Open
@Francesco215

Description

Expected Behavior

I have a machine with an AMD GPU (Radeon RX 7900 XT). I tried to install this library as described in the README by running

CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
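
(Inside the container described below, CC and CXX already point at the ROCm clang, so the full environment around that command is roughly the following; the compiler paths are just the defaults of the rocm/dev image and may differ on other setups.)

CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
  CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python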

Current Behavior

The installation fails; however, when I simply run pip install llama-cpp-python, it works.

Environment and Context

To make the issue reproducible, I made a Docker container with this Dockerfile (adapted from the llama.cpp repo):

ARG UBUNTU_VERSION=22.04

# This needs to generally match the container host's environment.
ARG ROCM_VERSION=5.6

# Target the ROCm build image
ARG BASE_ROCM_DEV_CONTAINER=rocm/dev-ubuntu-${UBUNTU_VERSION}:${ROCM_VERSION}-complete

FROM ${BASE_ROCM_DEV_CONTAINER} as build

# Unless otherwise specified, we make a fat build.
# List from https://github.com/ggerganov/llama.cpp/pull/1087#issuecomment-1682807878
# This is mostly tied to rocBLAS supported archs.
ARG ROCM_DOCKER_ARCH=\
    gfx803 \
    gfx900 \
    gfx906 \
    gfx908 \
    gfx90a \
    gfx1010 \
    gfx1030 \
    # gfx1100 is my ROCm arch
    gfx1100 \
    gfx1101 \
    gfx1102

# Set ROCm GPU targets
ENV GPU_TARGETS=${ROCM_DOCKER_ARCH}
# Enable ROCm
ENV CC=/opt/rocm/llvm/bin/clang
ENV CXX=/opt/rocm/llvm/bin/clang++
ENV LLAMA_HIPBLAS=on

RUN apt-get update && apt-get -y install cmake protobuf-compiler aria2 git
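
(Not part of the repro itself, but as a sanity check one can confirm inside the running container that the toolchain and the GPU are visible. The paths assume the defaults of the rocm/dev image, and rocminfo only sees the card when the container is started with the --device flags from the steps further below.)

/opt/rocm/llvm/bin/clang --version      # should report the ROCm clang 16.x
/opt/rocm/bin/rocminfo | grep -i gfx    # should list gfx1100 for the RX 7900 XT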

System Info:

CPU: 13th Gen Intel(R) Core(TM) i5-13400F
GPU: Radeon RX 7900 XT

Ubuntu 22.04.1

Python 3.10.6
Make 4.3
g++ 11.3.0

Failure Information (for bugs)

The installation failed; here is the output of running CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python:

root@8bebff5da3f1:/# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.1.82.tar.gz (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 3.0 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
  Downloading diskcache-5.6.1-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.6/45.6 KB 7.1 MB/s eta 0:00:00
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [390 lines of output]
      
      
      --------------------------------------------------------------------------------
      -- Trying 'Ninja' generator
      --------------------------------
      ---------------------------
      ----------------------
      -----------------
      ------------
      -------
      --
      CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.
      
        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      
      Not searching for unused variables given on the command line.
      
      -- The C compiler identification is Clang 16.0.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- The CXX compiler identification is Clang 16.0.0
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Configuring done (0.6s)
      -- Generating done (0.0s)
      -- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_cmake_test_compile/build
      --
      -------
      ------------
      -----------------
      ----------------------
      ---------------------------
      --------------------------------
      -- Trying 'Ninja' generator - success
      --------------------------------------------------------------------------------
      
      Configuring Project
        Working directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
        Command:
          /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888 -G Ninja -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-install -DPYTHON_VERSION_STRING:STRING=3.10.6 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython_EXECUTABLE:PATH=/usr/bin/python3 -DPython_ROOT_DIR:PATH=/usr -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/include/python3.10 -DPython3_EXECUTABLE:PATH=/usr/bin/python3 -DPython3_ROOT_DIR:PATH=/usr -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/include/python3.10 -DCMAKE_MAKE_PROGRAM:FILEPATH=/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/ninja/data/bin/ninja -DLLAMA_HIPBLAS=on -DCMAKE_BUILD_TYPE:STRING=Release -DLLAMA_HIPBLAS=on
      
      Not searching for unused variables given on the command line.
      -- The C compiler identification is Clang 16.0.0
      -- The CXX compiler identification is Clang 16.0.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Found Git: /usr/bin/git (found version "2.34.1")
      fatal: not a git repository (or any of the parent directories): .git
      fatal: not a git repository (or any of the parent directories): .git
      CMake Warning at vendor/llama.cpp/CMakeLists.txt:118 (message):
        Git repository not found; to enable automatic generation of build info,
        make sure Git is installed and the project is a Git repository.
      
      
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
      -- Found Threads: TRUE
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.
      
        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        vendor/llama.cpp/CMakeLists.txt:366 (find_package)
      
      
      -- hip::amdhip64 is SHARED_LIBRARY
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
      -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
      CMake Deprecation Warning at /opt/rocm/lib/cmake/hip/hip-config.cmake:20 (cmake_minimum_required):
        Compatibility with CMake < 3.5 will be removed from a future version of
        CMake.
      
        Update the VERSION argument <min> value or use a ...<max> suffix to tell
        CMake that the project does not need compatibility with older versions.
      Call Stack (most recent call first):
        /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/share/cmake-3.27/Modules/CMakeFindDependencyMacro.cmake:76 (find_package)
        /opt/rocm/lib/cmake/hipblas/hipblas-config.cmake:90 (find_dependency)
        vendor/llama.cpp/CMakeLists.txt:367 (find_package)
      
      
      -- hip::amdhip64 is SHARED_LIBRARY
      -- HIP and hipBLAS found
      -- CMAKE_SYSTEM_PROCESSOR: x86_64
      -- x86 detected
      -- Configuring done (0.6s)
      -- Generating done (0.0s)
      -- Build files have been written to: /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
      [1/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o
      [2/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/console.cpp.o
      [3/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/grammar-parser.cpp.o
      [4/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:186:11: warning: variable 'sum_x' set but not used [-Wunused-but-set-variable]
          float sum_x = 0;
                ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:187:11: warning: variable 'sum_x2' set but not used [-Wunused-but-set-variable]
          float sum_x2 = 0;
                ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/k_quants.c:182:14: warning: unused function 'make_qkx1_quants' [-Wunused-function]
      static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
                   ^
      3 warnings generated.
      [5/12] Building CXX object vendor/llama.cpp/common/CMakeFiles/common.dir/common.cpp.o
      [6/12] Building C object vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2413:5: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
          GGML_F16_VEC_REDUCE(sumf, sum);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
      #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                          ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
      #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                      ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
          res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
              ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:3456:9: warning: implicit conversion increases floating-point precision: 'float' to 'ggml_float' (aka 'double') [-Wdouble-promotion]
              GGML_F16_VEC_REDUCE(sumf[k], sum[k]);
              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2045:37: note: expanded from macro 'GGML_F16_VEC_REDUCE'
      #define GGML_F16_VEC_REDUCE         GGML_F32Cx8_REDUCE
                                          ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:2035:33: note: expanded from macro 'GGML_F32Cx8_REDUCE'
      #define GGML_F32Cx8_REDUCE      GGML_F32x8_REDUCE
                                      ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:1981:11: note: expanded from macro 'GGML_F32x8_REDUCE'
          res = _mm_cvtss_f32(_mm_hadd_ps(t1, t1));                     \
              ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:596:23: warning: unused function 'mul_sum_i8_pairs' [-Wunused-function]
      static inline __m128i mul_sum_i8_pairs(const __m128i x, const __m128i y) {
                            ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:627:19: warning: unused function 'hsum_i32_4' [-Wunused-function]
      static inline int hsum_i32_4(const __m128i a) {
                        ^
      /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/vendor/llama.cpp/ggml.c:692:23: warning: unused function 'packNibbles' [-Wunused-function]
      static inline __m128i packNibbles( __m256i bytes )
                            ^
      5 warnings generated.
      [7/12] Linking C static library vendor/llama.cpp/libggml_static.a
      [8/12] Building CXX object vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o
      [9/12] Building CXX object vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      [10/12] Linking CXX shared library vendor/llama.cpp/libggml_shared.so
      FAILED: vendor/llama.cpp/libggml_shared.so
      : && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libggml_shared.so -o vendor/llama.cpp/libggml_shared.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o  -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib:  --hip-link  --offload-arch=gfx900  --offload-arch=gfx906  --offload-arch=gfx908  --offload-arch=gfx90a  --offload-arch=gfx1030  /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600  /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600  /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a  /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
      >>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7343)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7361)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x7399)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      [11/12] Linking CXX shared library vendor/llama.cpp/libllama.so
      FAILED: vendor/llama.cpp/libllama.so
      : && /opt/rocm/llvm/bin/clang++ -fPIC -O3 -DNDEBUG   -shared -Wl,-soname,libllama.so -o vendor/llama.cpp/libllama.so vendor/llama.cpp/CMakeFiles/ggml.dir/ggml.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/ggml-alloc.c.o vendor/llama.cpp/CMakeFiles/ggml.dir/k_quants.c.o vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o vendor/llama.cpp/CMakeFiles/llama.dir/llama.cpp.o  -Wl,-rpath,/opt/rocm-5.6.0/lib:/opt/rocm/lib:  --hip-link  --offload-arch=gfx900  --offload-arch=gfx906  --offload-arch=gfx908  --offload-arch=gfx90a  --offload-arch=gfx1030  /opt/rocm-5.6.0/lib/libhipblas.so.1.0.50600  /opt/rocm-5.6.0/lib/librocblas.so.3.0.50600  /opt/rocm-5.6.0/llvm/lib/clang/16.0.0/lib/linux/libclang_rt.builtins-x86_64.a  /opt/rocm/lib/libamdhip64.so.5.6.50600 && :
      ld.lld: error: relocation R_X86_64_32 cannot be used against symbol '__gxx_personality_v0'; recompile with -fPIC
      >>> defined in /usr/lib/gcc/x86_64-linux-gnu/11/libstdc++.so
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E3F)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E5D)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(.eh_frame+0x9E95)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32S cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_32 cannot be used against local symbol; recompile with -fPIC
      >>> defined in vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol 'stderr'; recompile with -fPIC
      >>> defined in /lib/x86_64-linux-gnu/libc.so.6
      >>> referenced by ggml-cuda.cu
      >>>               vendor/llama.cpp/CMakeFiles/ggml-rocm.dir/ggml-cuda.cu.o:(ggml_init_cublas)
      
      ld.lld: error: too many errors emitted, stopping now (use --error-limit=0 to see all errors)
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/setuptools_wrap.py", line 674, in setup
          cmkr.make(make_args, install_target=cmake_install_target, env=env)
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 697, in make
          self.make_impl(clargs=clargs, config=config, source_dir=source_dir, install_target=install_target, env=env)
        File "/tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/skbuild/cmaker.py", line 742, in make_impl
          raise SKBuildError(msg)
      
      An error occurred while building with CMake.
        Command:
          /tmp/pip-build-env-_3ufrfgk/overlay/local/lib/python3.10/dist-packages/cmake/data/bin/cmake --build . --target install --config Release --
        Install target:
          install
        Source directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888
        Working directory:
          /tmp/pip-install-wf4bikyh/llama-cpp-python_19efb6e7a69446cd9a7c7007cc342888/_skbuild/linux-x86_64-3.10/cmake-build
      Please check the install target is valid and see CMake's output for more information.
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

For reference, here is what happens when I simply run pip install llama-cpp-python:

pip install llama-cpp-python
Collecting llama-cpp-python
  Using cached llama_cpp_python-0.1.82.tar.gz (1.8 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.20.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (1.25.2)
Collecting diskcache>=5.6.1
  Using cached diskcache-5.6.1-py3-none-any.whl (45 kB)
Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-cpp-python) (4.7.1)
Building wheels for collected packages: llama-cpp-python
  Building wheel for llama-cpp-python (pyproject.toml) ... done
  Created wheel for llama-cpp-python: filename=llama_cpp_python-0.1.82-cp310-cp310-linux_x86_64.whl size=593844 sha256=5523b29af1720e7931b4ca3caee8ebb65b502a8640db4f1e6a633eb7d444dff5
  Stored in directory: /root/.cache/pip/wheels/d5/5a/02/e3a3e540045da967de35d1ac2220a194e26e57b120bb46b466
Successfully built llama-cpp-python
Installing collected packages: diskcache, llama-cpp-python
Successfully installed diskcache-5.6.1 llama-cpp-python-0.1.82
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

After installing with this second method, the code runs as expected and utilizes the GPU.
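
To check that the GPU is really being used, I load a model with some layers offloaded and watch rocm-smi from another shell; the model path below is just a placeholder for whatever ggml model file you have locally.

python3 -c "from llama_cpp import Llama; llm = Llama(model_path='/models/your-model.q4_0.bin', n_gpu_layers=35); print(llm('Q: Name the planets in the solar system. A:', max_tokens=32))"
rocm-smi    # VRAM usage / GPU utilization should rise while the model is loaded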

Steps to Reproduce

Make sure you have an AMD GPU

  1. Build the Docker image from the Dockerfile written above: docker build --pull --rm -f "Dockerfile" -t llama-cpp-python-container:latest .
  2. Run it: docker run -it --device=/dev/kfd --device=/dev/dri llama-cpp-python-container bash
  3. Try the two installation methods, CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python and pip install llama-cpp-python (consolidated below)
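
Put together, the reproduction looks like this (the trailing dot on docker build is the build context; run it from the directory containing the Dockerfile):

docker build --pull --rm -f "Dockerfile" -t llama-cpp-python-container:latest .
docker run -it --device=/dev/kfd --device=/dev/dri llama-cpp-python-container bash
# inside the container:
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python   # fails
pip install llama-cpp-python                                                 # succeeds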

Failure Logs

Environment info

llama-cpp-python$ pip list | egrep "uvicorn|fastapi|sse-starlette|numpy"
numpy              1.25.2

I'm not sure where to get the llama-cpp version
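
(I assume the Python package version is what matters here; pip can report it, and the vendored llama.cpp revision is whatever that release pins.)

pip show llama-cpp-python    # reports llama-cpp-python 0.1.82 in this container
python3 -c "import llama_cpp; print(getattr(llama_cpp, '__version__', 'not exposed'))"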
