
chatUI requests the llamacpp server by a hardcoded address #1303

Open
andreys42 opened this issue Jun 25, 2024 · 1 comment

Comments

andreys42 commented Jun 25, 2024

I am running the chatui instance as a pod in a k8s cluster with the following params:

envVars:
  MONGODB_URL: mongodb://chatui-mongodb:27017
  HF_TOKEN: -----------------------------------
  MODELS: '[
    {
      "name": "Meta-Llama-3-8B-Instruct-q5_k_m.gguf",
      "endpoints": [{
        "type" : "llamacpp",
        "baseURL": "http://llama-cpp-server:8000"
      }],
    },
  ]'

llamacpp is launched with the following params:

containers:
  - name: llama-cpp-server
    image: ghcr.io/ggerganov/llama.cpp:server-cuda
    imagePullPolicy: IfNotPresent
    args: ["-m", "/models/Meta-Llama-3-8B-Instruct-q5_k_m.gguf", "--port", "8000", "--host", "0.0.0.0", "-n", "512", "--n-gpu-layers", "1"]

Even though baseURL is set to "http://llama-cpp-server:8000", so chatUI should be requesting that address, I get the following error when I try to prompt something on the frontend:

{"level":50,"time":1719306384776,"pid":22,"hostname":"chatui-7f75bfc479-zfqbg","err":{"type":"TypeError","message":"fetch failed: ","stack":"TypeError: fetch failed\n    at fetch (file:///app/build/shims.js:20346:13)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async Promise.all (index 1)\n    at async Promise.all (index 0)\n    at async file:///app/build/server/chunks/index3-1a1c67bb.js:388:17\ncaused by: AggregateError [ECONNREFUSED]: \n    at internalConnectMultiple (node:net:1117:18)\n    at internalConnectMultiple (node:net:1185:5)\n    at afterConnectMultiple (node:net:1684:7)"},"msg":"Failed to initialize PlaywrightBlocker from prebuilt lists"}
{"level":50,"time":1719306387338,"pid":22,"hostname":"chatui-7f75bfc479-zfqbg","err":{"type":"TypeError","message":"fetch failed: connect ECONNREFUSED 127.0.0.1:8080","stack":"TypeError: fetch failed\n    at fetch (file:///app/build/shims.js:20346:13)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async file:///app/build/server/chunks/models-fc8a6ecf.js:98900:15\n    at async generateFromDefaultEndpoint (file:///app/build/server/chunks/index3-1a1c67bb.js:213:23)\n    at async generateTitle (file:///app/build/server/chunks/_server.ts-3da96c1b.js:213:10)\n    at async generateTitleForConversation (file:///app/build/server/chunks/_server.ts-3da96c1b.js:177:19)\ncaused by: Error: connect ECONNREFUSED 127.0.0.1:8080\n    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1606:16)"},"msg":"fetch failed"}
TypeError: fetch failed
    at fetch (file:///app/build/shims.js:20346:13)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async file:///app/build/server/chunks/models-fc8a6ecf.js:98900:15
    at async generate (file:///app/build/server/chunks/_server.ts-3da96c1b.js:426:30)
    at async textGenerationWithoutTitle (file:///app/build/server/chunks/_server.ts-3da96c1b.js:487:3) {
  cause: Error: connect ECONNREFUSED 127.0.0.1:8080
      at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1606:16) {
    errno: -111,
    code: 'ECONNREFUSED',
    syscall: 'connect',
    address: '127.0.0.1',
    port: 8080
  }
}

AFAIK this means that chatUI requests some predefined (or hardcoded) URL - 127.0.0.1:8080 - to get predictions.
My humble opinion is that chatUI ignores baseURL when the type is llamacpp and falls back to 127.0.0.1:8080 as the default.
Is this expected behaviour?

PS: My guess is that the problem is here

@nsarrazin
Copy link
Collaborator

Hi! Thanks for the report. I think the llama.cpp endpoint type was using a parameter named url instead of baseURL.

I implemented a fix here and will let you know once it's deployed.
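In the meantime, a possible workaround (an untested sketch, assuming the currently deployed build still reads url for llamacpp endpoints as described above) would be to pass the address under url instead of baseURL:

envVars:
  MODELS: '[
    {
      "name": "Meta-Llama-3-8B-Instruct-q5_k_m.gguf",
      "endpoints": [{
        "type": "llamacpp",
        "url": "http://llama-cpp-server:8000"
      }]
    }
  ]'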
