
🐛 [Bug]: DLLAMA_BLAS_VENDOR=OpenBLAS build with pip is not enabling OpenBlas #977

Closed · 2 tasks done

amgowda-oci opened this issue Dec 13, 2023 · 3 comments

Bug description

I installed OpenBLAS and tried pip install as documented:

    CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install -v llama-cpp-python

but the llama.cpp that gets built does not have BLAS enabled; somewhere, when pip builds llama-cpp-python, these flags are not being passed down to llama.cpp:

serge-serge-1 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
serge-serge-1 | INFO: 172.18.0.2:45058 - "POST /chat/?temperature=0.1&top_k=50&max_length=2048&top_p=0.95&context_window=256&gpu_layers=0&repeat_last_n=64&model=Mistral-7B&n_threads=40&repeat_penalty=1.3&init_prompt=Below+is+an+instruction+that+describes+a+task.+Write+a+response+that+appropriately+completes+the+request. HTTP/1.1" 200 OK
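One thing worth ruling out (a sketch, not verified against this setup): pip may reuse a cached or previously built wheel, in which case CMAKE_ARGS never reaches a compile step at all. Forcing a rebuild would look something like:

```shell
# Sketch: force pip to rebuild llama-cpp-python so the CMake flags can
# take effect. --no-cache-dir avoids reusing a previously built wheel;
# --force-reinstall replaces the currently installed copy.
cmake_args='-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS'
install_cmd="CMAKE_ARGS=\"$cmake_args\" pip install -v --force-reinstall --no-cache-dir llama-cpp-python"

# Print the command; run it with: eval "$install_cmd"
echo "$install_cmd"
```

The verbose (`-v`) output should then show a CMake configure step mentioning BLAS; if no compile happens at all, pip resolved a wheel and the flags were silently ignored.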

Steps to reproduce

  1. Install libopenblas-dev on your Linux OS/Docker container
  2. Run: CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install -v llama-cpp-python
  3. Download any model
  4. Run the ./api or llama-cpp-python
  5. You will see in the Serge logs that llama.cpp loads the ML model without OpenBLAS enabled

(screenshot of the resulting log omitted)
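Step 5 can be checked mechanically by grepping the llama.cpp system-info banner (the line quoted in the bug description) for the BLAS flag; a small sketch, using an abridged copy of the banner from this report:

```shell
# Sketch: return success if a llama.cpp system-info banner reports BLAS = 1.
has_blas() {
    echo "$1" | grep -Eq 'BLAS *= *1'
}

# Abridged banner copied from the log in this report (note BLAS = 0).
banner='AVX = 0 | AVX2 = 0 | NEON = 1 | ARM_FMA = 1 | BLAS = 0 | VSX = 0 |'

if has_blas "$banner"; then
    echo "OpenBLAS enabled"
else
    echo "OpenBLAS NOT enabled"
fi
```

Running this against the banner above prints "OpenBLAS NOT enabled", matching the symptom reported here.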

Environment Information

# ---------------------------------------
# Base image for node
# ---------------------------------------
FROM node:20-bookworm-slim as node_base

# ---------------------------------------
# Base image for redis
# ---------------------------------------
FROM redis:7-bookworm as redis

# ---------------------------------------
# Dev environment
# ---------------------------------------
FROM python:3.11-slim-bookworm as dev

# Set ENV
WORKDIR /usr/src/app
ENV TZ=Etc/UTC
ENV NODE_ENV='development'

# Install dependencies
RUN apt-get update \
    && apt-get install -y --no-install-recommends dumb-init \
    && apt-get install -y libopenblas-dev \
    && pip install --upgrade pip

# Copy database, source code, and scripts
COPY --from=redis /usr/local/bin/redis-server /usr/local/bin/redis-server
COPY --from=redis /usr/local/bin/redis-cli /usr/local/bin/redis-cli
COPY --from=node_base /usr/local /usr/local
COPY scripts/dev.sh /usr/src/app/dev.sh
COPY scripts/serge.env /usr/src/app/serge.env
COPY vendor/requirements.txt /usr/src/app/requirements.txt
COPY ./web/package.json ./web/package-lock.json ./

RUN npm ci \
    && chmod 755 /usr/src/app/dev.sh \
    && chmod 755 /usr/local/bin/redis-server \
    && chmod 755 /usr/local/bin/redis-cli \
    && mkdir -p /etc/redis \
    && mkdir -p /data/db \
    && mkdir -p /usr/src/app/weights \
    && echo "appendonly yes" >> /etc/redis/redis.conf \
    && echo "dir /data/db/" >> /etc/redis/redis.conf

EXPOSE 8008
EXPOSE 9124
ENTRYPOINT ["/usr/bin/dumb-init", "--"]
CMD ["/bin/bash", "-c", "/usr/src/app/dev.sh"]

DEV.SH

#!/bin/bash

set -x
source serge.env

# Get CPU Architecture

cpu_arch=$(uname -m)

# Build with OpenBLAS=1
blasconfig='CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"'

# Function to detect CPU features
detect_cpu_features() {
    cpu_info=$(lscpu)
    if echo "$cpu_info" | grep -q "avx512"; then
        echo "AVX512"
    elif echo "$cpu_info" | grep -q "avx2"; then
        echo "AVX2"
    elif echo "$cpu_info" | grep -q "avx"; then
        echo "AVX"
    else
        echo "basic"
    fi
}

# Check if the CPU architecture is aarch64/arm64
if [ "$cpu_arch" = "aarch64" ]; then
    pip_command="$blasconfig pip install -v llama-cpp-python==$LLAMA_PYTHON_VERSION --only-binary=:all: --extra-index-url=https://gaby.github.io/arm64-wheels/"
else
    # Use @jllllll provided wheels
    cpu_feature=$(detect_cpu_features)
    pip_command="$blasconfig pip install -v llama-cpp-python==$LLAMA_PYTHON_VERSION --only-binary=:all: --extra-index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/$cpu_feature/cpu"
fi

echo "Recommended install command for llama-cpp-python: $pip_command"

# Install python vendor dependencies
pip install -r /usr/src/app/requirements.txt || {
    echo 'Failed to install python dependencies from requirements.txt'
    exit 1
}

# Install python dependencies
pip install -e ./api || {
    echo 'Failed to install python dependencies'
    exit 1
}

# Install python bindings
eval "$pip_command" || {
    echo 'Failed to install llama-cpp-python'
    exit 1
}

# Start Redis instance
redis-server /etc/redis/redis.conf &

# Start the web server
cd /usr/src/app/web || exit 1
npm run dev -- --host 0.0.0.0 --port 8008 &

# Start the API
cd /usr/src/app/api || exit 1
uvicorn src.serge.main:api_app --reload --host 0.0.0.0 --port 9124 --root-path /api/ || {
    echo 'Failed to start main app'
    exit 1
}
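One observation about the script above (an assumption worth verifying, not a confirmed diagnosis): both pip commands pass `--only-binary=:all:`, which tells pip to install a prebuilt wheel and never compile from source, so CMAKE_ARGS cannot influence the resulting binary regardless of how it is quoted. A source build would drop that flag, along these lines:

```shell
# Sketch (assumption): drop --only-binary=:all: so pip compiles llama.cpp
# from source, letting CMAKE_ARGS reach CMake. $LLAMA_PYTHON_VERSION is the
# version pin normally sourced from serge.env in dev.sh above.
LLAMA_PYTHON_VERSION="${LLAMA_PYTHON_VERSION:-}"
blasconfig='CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS"'
pip_command="$blasconfig pip install -v --no-cache-dir llama-cpp-python==$LLAMA_PYTHON_VERSION"

# As in dev.sh, the env-var prefix only works because the command is run
# through eval: eval "$pip_command"
echo "$pip_command"
```

The trade-off is a much slower install (a full C/C++ compile) and a build-toolchain requirement in the image, which is presumably why the script prefers wheels.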

Screenshots

No response

Relevant log output

No response

Confirmations

  • I'm running the latest version of the main branch.
  • I checked existing issues to see if this has already been described.
gaby (Member) commented Dec 13, 2023

@amgowda-oci Serge doesn't support GPU yet. Work is being done in #944

gaby closed this as completed Dec 13, 2023
amgowda-oci (Author)

@gaby this is using OpenBLAS, which has no GPU dependency

gaby (Member) commented Dec 13, 2023

@amgowda-oci I don't know then. Ask the llama-cpp-python team; we are just installing it with pip. It sounds like a problem on their side.
