[BUG]: Generic OpenAI Embedder 404 Error with Embedding APIs (Silicon Flow, HF, Jina) #3245

Closed
CookingNoodle opened this issue Feb 17, 2025 · 1 comment
Labels
possible bug Bug was reported but is not confirmed or is unable to be replicated.

Comments

@CookingNoodle

How are you running AnythingLLM?

Docker (remote machine)

What happened?

🐛 Bug Report

Description

When using AnythingLLM to connect to embedding APIs, including Silicon Flow, Hugging Face Inference API, and Jina AI Embedding API, configuring the "Generic OpenAI" embedder results in a 404 error "404 page not found" during document embedding. Testing the APIs directly with curl commands works fine, ruling out network issues and API key problems. This suggests that AnythingLLM's "Generic OpenAI" embedder might be using an incorrect request method (GET instead of POST) or an incorrect request URL.

Expected behavior

AnythingLLM should successfully call the configured embedding API (Silicon Flow, Hugging Face, or Jina), embed the document content, and add the document to the Workspace without 404 errors.

Actual behavior

AnythingLLM document addition fails, reporting "Generic OpenAI Failed to embed: [failed_to_embed]: 404 404 page not found" error when using Silicon Flow, Hugging Face Inference API, and Jina AI Embedding API.

Environment

  • AnythingLLM Version: latest
  • Deployment Method: Docker
  • OS: Ubuntu 22.04
  • Browser: Chrome
  • Silicon Flow/Hugging Face/Jina API works with curl?: Yes, curl commands work fine for all tested APIs.

Additional context

I am not a developer and have deployed AnythingLLM by following AI instructions. Therefore, I have limited coding knowledge and rely heavily on AI guidance for setup and troubleshooting.

I have tested the "Generic OpenAI" embedder with three different embedding APIs: Silicon Flow, Hugging Face Inference API, and Jina AI Embedding API. All three APIs result in the same 404 error in AnythingLLM, while curl commands to these APIs work correctly. This strongly suggests the issue lies within the "Generic OpenAI" embedder's request construction or handling in AnythingLLM, rather than with the API endpoints themselves or network connectivity. It is possible that the "Generic OpenAI" embedder is incorrectly using GET requests instead of POST requests, or is constructing the request URL or request body incorrectly for these APIs.
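For comparison, the direct test that succeeds is a POST with a JSON body sent to the full endpoint URL. A minimal Python sketch of that request shape follows. It assumes the OpenAI-style body fields `model` and `input`, and uses a placeholder API key. The helper name `build_embedding_request` is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_embedding_request(endpoint: str, api_key: str, model: str, text: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings POST request."""
    # Assumption: the provider accepts the OpenAI-style {"model", "input"} body.
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_embedding_request(
    "https://api.siliconflow.cn/v1/embeddings",  # full endpoint, as used with curl
    "YOUR_API_KEY",  # placeholder, not a real key
    "BAAI/bge-large-zh-v1.5",
    "hello world",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Note that this direct call targets the complete endpoint path, which is not necessarily the value an SDK-based client expects in its base URL setting.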

Are there known steps to reproduce?

To Reproduce

  1. In the AnythingLLM backend, configure the "Generic OpenAI" embedder with API Endpoint URLs for Silicon Flow (https://api.siliconflow.cn/v1/embeddings), Hugging Face Inference API (https://api-inference.huggingface.co/v2/embeddings), and Jina AI Embedding API (https://api.jina.ai/embeddings) respectively, and configure the correct API Keys, Model Names (e.g., BAAI/bge-large-zh-v1.5 for Silicon Flow, and relevant models for Hugging Face and Jina).
  2. Create a Workspace and select the "Generic OpenAI" embedder.
  3. Upload a document (e.g., PDF, TXT).
  4. Observe the document adding process, which fails with the error message "Error: 1 documents failed to add." and "Generic OpenAI Failed to embed: [failed_to_embed]: 404 404 page not found".
  5. Inspect the browser developer tools: the Console tab shows the "Generic OpenAI Failed to embed: [failed_to_embed]: 404 404 page not found" error, but the Network tab shows no request that returned a 404.
  6. Check the AnythingLLM server-side logs, which show the error "addDocumentToNamespace GenericOpenAI Failed to embed: [failed_to_embed]: 404 404 page not found".
@CookingNoodle CookingNoodle added the possible bug Bug was reported but is not confirmed or is unable to be replicated. label Feb 17, 2025
@timothycarambat
Member

timothycarambat commented Feb 17, 2025

This happens because you are using the wrong base URL.
We use the OpenAI SDK in the backend, so all you need is https://api.siliconflow.cn for the base. The same applies to all of the other providers. A 404 means your base URL is wrong; it is not a bug.

Example with SiliconFlow: #3109 (comment)
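To illustrate why a full endpoint URL fails as a base, here is a minimal Python sketch. It assumes the client builds the request URL by appending the resource path `/embeddings` to the configured base URL, which is how OpenAI-style client libraries generally behave. The helper names `embeddings_url` and `to_base_url` are hypothetical and are not part of AnythingLLM or the OpenAI SDK.

```python
def embeddings_url(base_url: str) -> str:
    """Mimic an OpenAI-style client: request URL = base URL + "/embeddings"."""
    return base_url.rstrip("/") + "/embeddings"


def to_base_url(configured: str) -> str:
    """Strip a trailing "/embeddings" so a full endpoint URL becomes a base URL."""
    trimmed = configured.rstrip("/")
    suffix = "/embeddings"
    return trimmed[: -len(suffix)] if trimmed.endswith(suffix) else trimmed


full_endpoint = "https://api.siliconflow.cn/v1/embeddings"

# Full endpoint used as the base: the path segment is doubled.
print(embeddings_url(full_endpoint))
# https://api.siliconflow.cn/v1/embeddings/embeddings

# Trailing "/embeddings" removed first: the path appears once.
print(embeddings_url(to_base_url(full_endpoint)))
# https://api.siliconflow.cn/v1/embeddings
```

The sketch only removes the trailing `/embeddings`. Whether a version prefix such as `/v1` belongs in the base differs between providers, so the linked SiliconFlow example remains the reference for the exact value.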

@Mintplex-Labs Mintplex-Labs locked as resolved and limited conversation to collaborators Feb 17, 2025