FR Support local server for embeddings #645

Open
ArtificialAmateur opened this issue Jun 9, 2024 · 23 comments
Labels
enhancement New feature or request

Comments

@ArtificialAmateur

Jumping off of #302

Like the local server options for Smart Chat, similar work can be done for embeddings.

The OpenAI format API (which LM Studio and Ollama support) is /v1/embeddings
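
For readers unfamiliar with the format, here is a minimal sketch of a request to an OpenAI-format `/v1/embeddings` endpoint. The base URL is the LM Studio address used later in this thread; the model name and all function names are assumptions for illustration and are not part of Smart Connections.

```javascript
// Minimal sketch: request embeddings from an OpenAI-format endpoint.
// BASE_URL and MODEL are assumptions; substitute your own server's values.
const BASE_URL = "http://127.0.0.1:1234"; // LM Studio address used later in this thread
const MODEL = "nomic-embed-text-v1.5"; // hypothetical model name

// Build the JSON body expected by POST /v1/embeddings.
function build_embed_request(inputs, model = MODEL) {
  return { model, input: inputs };
}

// Pull the vectors out of an OpenAI-format response, preserving input order.
function parse_embed_response(json) {
  return [...json.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

// fetch_fn is injectable so the function can be exercised without a server.
async function embed_texts(inputs, fetch_fn = fetch, base_url = BASE_URL) {
  const resp = await fetch_fn(`${base_url}/v1/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(build_embed_request(inputs)),
  });
  if (!resp.ok) throw new Error(`Embedding request failed: HTTP ${resp.status}`);
  return parse_embed_response(await resp.json());
}
```

Calling `embed_texts(["some note text"])` against a running server should resolve to an array with one vector per input.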

@daaain

daaain commented Jun 30, 2024

I'd love this, the embedded WASM models don't seem to saturate the CPU / GPU so it takes ages...

@brianpetro changed the title from "Feature Request: Support local server for embeddings" to "FR Support local server for embeddings" on Jul 5, 2024
@brianpetro added the enhancement (New feature or request) label and removed the needs review label on Jul 5, 2024
@brianpetro
Owner

Makes sense. Thanks for the feature request 😊🌴

@jagai

jagai commented Sep 24, 2024

@daaain @ArtificialAmateur While this isn't an ideal solution, I did manage to set up gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf on LM Studio to work in @brianpetro's incredible smart-connections plugin.

What I've done is essentially remove the checks on api.openai.com and point the adapter at my local LM Studio server instead. Be aware that doing so takes away the ability to use the OpenAI embeddings, because this refactors the components that connect to OpenAI rather than adding to the existing functionality.

This is a quick and dirty fix for those who'd rather handle the embeddings locally. It's far from ideal, but it works really well for my use case.

Enjoy!

Instructions

In main.js, I've refactored as follows:

Before:

var SmartEmbedOpenAIAdapter = class extends SmartEmbedAdapter {
  constructor(smart_embed) {
    super(smart_embed);
    this.model_key = smart_embed.opts.model_key || "text-embedding-ada-002";
    this.endpoint = "https://api.openai.com/v1/embeddings";
    this.max_tokens = 8191;
    this.dims = smart_embed.opts.dims || 1536;
    this.enc = null;
    this.request_adapter = smart_embed.env.opts.request_adapter;
  }

After:

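var SmartEmbedOpenAIAdapter = class extends SmartEmbedAdapter {
  constructor(smart_embed) {
    super(smart_embed);
    // No OpenAI default any more: model_key must come from the plugin's options.
    this.model_key = smart_embed.opts.model_key;
    // Local LM Studio server instead of https://api.openai.com/v1/embeddings.
    this.endpoint = "http://127.0.0.1:1234/v1/embeddings";
    // Matches max_tokens in the model entry added to models_default.
    this.max_tokens = 2048;
    this.enc = null;
    this.request_adapter = smart_embed.env.opts.request_adapter;
  }
  // ...rest of the class is unchanged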

In var models_default, after the Xenova/jina-embeddings-v2-base-zh entry, I've added the model so that it is selectable in the smart-connections plugin:

  "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {
    id: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",
    batch_size: 1,
    dims: 512,
    max_tokens: 2048,
    name: "LLM Studio Nomic",
    description: "API, 2,048 tokens, 512 dim",
    endpoint: "http://127.0.0.1:1234/v1/embeddings",
    adapter: "openai"
  },
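
Since each API entry in models_default already carries an `endpoint` field, one less destructive variant of this patch would read the endpoint from the options and only fall back to OpenAI when none is given. The following is a sketch under that assumption; `resolve_embed_config` is a hypothetical helper, not part of Smart Connections, and accepting either `model_key` or `id` is a guess made because the entry above uses `id` while the other entries use `model_key`.

```javascript
// Hypothetical helper, not part of Smart Connections: derive adapter settings
// from a model entry, falling back to the OpenAI defaults from the "Before" code.
const OPENAI_EMBED_ENDPOINT = "https://api.openai.com/v1/embeddings";

function resolve_embed_config(opts = {}) {
  return {
    // Assumption: accept either key name, since the entries differ on this.
    model_key: opts.model_key || opts.id || "text-embedding-ada-002",
    // A local entry supplies its own endpoint; OpenAI entries keep working.
    endpoint: opts.endpoint || OPENAI_EMBED_ENDPOINT,
    max_tokens: opts.max_tokens || 8191,
    dims: opts.dims || 1536,
  };
}

// Example with the values from the entry above (model id shortened here).
const local_config = resolve_embed_config({
  id: "nomic-embed-text-v1.5.f16.gguf",
  endpoint: "http://127.0.0.1:1234/v1/embeddings",
  max_tokens: 2048,
  dims: 512,
});
```

With this shape, the constructor could assign `this.endpoint = config.endpoint` and so on, which would keep the OpenAI models usable alongside the local one.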

In var transformers_connector, I've added the same entry to the list of models. To save you the trouble, here is the entire string to replace:

var transformers_connector = '// models.json\nvar models_default = {\n "TaylorAI/bge-micro-v2": {\n model_key: "TaylorAI/bge-micro-v2",\n batch_size: 1,\n dims: 384,\n max_tokens: 512,\n name: "BGE-micro-v2",\n description: "Local, 512 tokens, 384 dim",\n adapter: "transformers"\n },\n "andersonbcdefg/bge-small-4096": {\n model_key: "andersonbcdefg/bge-small-4096",\n batch_size: 1,\n dims: 384,\n max_tokens: 4096,\n name: "BGE-small-4K",\n description: "Local, 4,096 tokens, 384 dim",\n adapter: "transformers"\n },\n "Xenova/jina-embeddings-v2-base-zh": {\n model_key: "Xenova/jina-embeddings-v2-base-zh",\n batch_size: 1,\n dims: 512,\n max_tokens: 8192,\n name: "Jina-v2-base-zh-8K",\n description: "Local, 8,192 tokens, 512 dim, Chinese/English bilingual",\n adapter: "transformers"\n },\n "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf": {\n id: "gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf",\n batch_size: 1,\n dims: 512,\n max_tokens: 2048,\n name: "LLM Studio Nomic", \n description: "API, 2,048 tokens, 512 dim", \n endpoint: "http://127.0.0.1:1234/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-small": {\n model_key: "text-embedding-3-small",\n batch_size: 50,\n dims: 1536,\n max_tokens: 8191,\n name: "OpenAI Text-3 Small",\n description: "API, 8,191 tokens, 1,536 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-large": {\n model_key: "text-embedding-3-large",\n batch_size: 50,\n dims: 3072,\n max_tokens: 8191,\n name: "OpenAI Text-3 Large",\n description: "API, 8,191 tokens, 3,072 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-3-small-512": {\n model_key: "text-embedding-3-small",\n batch_size: 50,\n dims: 512,\n max_tokens: 8191,\n name: "OpenAI Text-3 Small - 512",\n description: "API, 8,191 tokens, 512 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n 
},\n "text-embedding-3-large-256": {\n model_key: "text-embedding-3-large",\n batch_size: 50,\n dims: 256,\n max_tokens: 8191,\n name: "OpenAI Text-3 Large - 256",\n description: "API, 8,191 tokens, 256 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "text-embedding-ada-002": {\n model_key: "text-embedding-ada-002",\n batch_size: 50,\n dims: 1536,\n max_tokens: 8191,\n name: "OpenAI Ada",\n description: "API, 8,191 tokens, 1,536 dim",\n endpoint: "https://api.openai.com/v1/embeddings",\n adapter: "openai"\n },\n "Xenova/jina-embeddings-v2-small-en": {\n model_key: "Xenova/jina-embeddings-v2-small-en",\n batch_size: 1,\n dims: 512,\n max_tokens: 8192,\n name: "Jina-v2-small-en",\n description: "Local, 8,192 tokens, 512 dim",\n adapter: "transformers"\n },\n "nomic-ai/nomic-embed-text-v1.5": {\n model_key: "nomic-ai/nomic-embed-text-v1.5",\n batch_size: 1,\n dims: 256,\n max_tokens: 8192,\n name: "Nomic-embed-text-v1.5",\n description: "Local, 8,192 tokens, 256 dim",\n adapter: "transformers"\n },\n "Xenova/bge-small-en-v1.5": {\n model_key: "Xenova/bge-small-en-v1.5",\n batch_size: 1,\n dims: 384,\n max_tokens: 512,\n name: "BGE-small",\n description: "Local, 512 tokens, 384 dim",\n adapter: "transformers"\n },\n "nomic-ai/nomic-embed-text-v1": {\n model_key: "nomic-ai/nomic-embed-text-v1",\n batch_size: 1,\n dims: 768,\n max_tokens: 2048,\n name: "Nomic-embed-text",\n description: "Local, 2,048 tokens, 768 dim",\n adapter: "transformers"\n }\n};\n\n// smart_embed_model.js\nvar SmartEmbedModel = class _SmartEmbedModel {\n /**\n * Create a SmartEmbed instance.\n * @param {string} env - The environment to use.\n * @param {object} opts - Full model configuration object or at least a model_key and adapter\n */\n constructor(env, opts = {}) {\n this.env = env;\n this.opts = {\n ...models_default[opts.embed_model_key],\n ...opts\n };\n console.log(this.opts);\n if (!this.opts.adapter)\n return console.warn("SmartEmbedModel adapter not 
set");\n if (!this.env.opts.smart_embed_adapters[this.opts.adapter])\n return console.warn(`SmartEmbedModel adapter ${this.opts.adapter} not found`);\n this.opts.use_gpu = !!navigator.gpu && this.opts.gpu_batch_size !== 0;\n if (this.opts.adapter === "transformers" && this.opts.use_gpu)\n this.opts.batch_size = this.opts.gpu_batch_size || 10;\n this.adapter = new this.env.opts.smart_embed_adapters[this.opts.adapter](this);\n }\n /**\n * Used to load a model with a given configuration.\n * @param {*} env \n * @param {*} opts \n */\n static async load(env, opts = {}) {\n try {\n const model2 = new _SmartEmbedModel(env, opts);\n await model2.adapter.load();\n env.smart_embed_active_models[opts.embed_model_key] = model2;\n return model2;\n } catch (error) {\n console.error(`Error loading model ${opts.model_key}:`, error);\n return null;\n }\n }\n /**\n * Count the number of tokens in the input string.\n * @param {string} input - The input string to process.\n * @returns {Promise<number>} A promise that resolves with the number of tokens.\n */\n async count_tokens(input) {\n return this.adapter.count_tokens(input);\n }\n /**\n * Embed the input into a numerical array.\n * @param {string|Object} input - The input to embed. Can be a string or an object with an "embed_input" property.\n * @returns {Promise<Object>} A promise that resolves with an object containing the embedding vector at `vec` and the number of tokens at `tokens`.\n */\n async embed(input) {\n if (typeof input === "string")\n input = { embed_input: input };\n return (await this.embed_batch([input]))[0];\n }\n /**\n * Embed a batch of inputs into arrays of numerical arrays.\n * @param {Array<string|Object>} inputs - The array of inputs to embed. 
Each input can be a string or an object with an "embed_input" property.\n * @returns {Promise<Array<Object>>} A promise that resolves with an array of objects containing `vec` and `tokens` properties.\n */\n async embed_batch(inputs) {\n return await this.adapter.embed_batch(inputs);\n }\n get batch_size() {\n return this.opts.batch_size || 1;\n }\n get max_tokens() {\n return this.opts.max_tokens || 512;\n }\n};\n\n// adapters/_adapter.js\nvar SmartEmbedAdapter = class {\n constructor(smart_embed) {\n this.smart_embed = smart_embed;\n }\n async load() {\n throw new Error("Not implemented");\n }\n async count_tokens(input) {\n throw new Error("Not implemented");\n }\n async embed(input) {\n throw new Error("Not implemented");\n }\n async embed_batch(input) {\n throw new Error("Not implemented");\n }\n};\n\n// adapters/transformers.js\nvar SmartEmbedTransformersAdapter = class extends SmartEmbedAdapter {\n constructor(smart_embed) {\n super(smart_embed);\n this.model = null;\n this.tokenizer = null;\n }\n get batch_size() {\n if (this.use_gpu && this.smart_embed.opts.gpu_batch_size)\n return this.smart_embed.opts.gpu_batch_size;\n return this.smart_embed.opts.batch_size || 1;\n }\n get max_tokens() {\n return this.smart_embed.opts.max_tokens || 512;\n }\n get use_gpu() {\n return this.smart_embed.opts.use_gpu || false;\n }\n async load() {\n const { pipeline, env, AutoTokenizer } = await import("@xenova/transformers");\n env.allowLocalModels = false;\n const pipeline_opts = {\n quantized: true\n };\n if (this.use_gpu) {\n console.log("[Transformers] Using GPU");\n pipeline_opts.device = "webgpu";\n pipeline_opts.dtype = "fp32";\n } else {\n console.log("[Transformers] Using CPU");\n env.backends.onnx.wasm.numThreads = 8;\n }\n this.model = await pipeline("feature-extraction", this.smart_embed.opts.model_key, pipeline_opts);\n this.tokenizer = await AutoTokenizer.from_pretrained(this.smart_embed.opts.model_key);\n }\n async count_tokens(input) {\n if (!this.tokenizer)\n 
await this.load();\n const { input_ids } = await this.tokenizer(input);\n return { tokens: input_ids.data.length };\n }\n async embed_batch(inputs) {\n if (!this.model)\n await this.load();\n const filtered_inputs = inputs.filter((item) => item.embed_input?.length > 0);\n if (!filtered_inputs.length)\n return [];\n if (filtered_inputs.length > this.batch_size) {\n throw new Error(`Input size (${filtered_inputs.length}) exceeds maximum batch size (${this.batch_size})`);\n }\n const tokens = await Promise.all(filtered_inputs.map((item) => this.count_tokens(item.embed_input)));\n const embed_inputs = await Promise.all(filtered_inputs.map(async (item, i) => {\n if (tokens[i].tokens < this.max_tokens)\n return item.embed_input;\n let token_ct = tokens[i].tokens;\n let truncated_input = item.embed_input;\n while (token_ct > this.max_tokens) {\n const pct = this.max_tokens / token_ct;\n const max_chars = Math.floor(truncated_input.length * pct * 0.9);\n truncated_input = truncated_input.substring(0, max_chars) + "...";\n token_ct = (await this.count_tokens(truncated_input)).tokens;\n }\n tokens[i].tokens = token_ct;\n return truncated_input;\n }));\n try {\n const resp = await this.model(embed_inputs, { pooling: "mean", normalize: true });\n return filtered_inputs.map((item, i) => {\n item.vec = Array.from(resp[i].data).map((val) => Math.round(val * 1e8) / 1e8);\n item.tokens = tokens[i].tokens;\n return item;\n });\n } catch (err) {\n console.error("error_embedding_batch", err);\n return Promise.all(filtered_inputs.map((item) => this.embed(item.embed_input)));\n }\n }\n};\n\n// build/transformers_iframe_script.js\nvar model = null;\nvar smart_env = {\n smart_embed_active_models: {},\n opts: {\n smart_embed_adapters: {\n transformers: SmartEmbedTransformersAdapter\n }\n }\n};\nasync function processMessage(data) {\n const { method, params, id, iframe_id } = data;\n try {\n let result;\n switch (method) {\n case "init":\n console.log("init");\n break;\n case "load":\n 
console.log("load", params);\n model = await SmartEmbedModel.load(smart_env, { adapter: "transformers", model_key: params.model_key, ...params });\n result = { model_loaded: true };\n break;\n case "embed_batch":\n if (!model)\n throw new Error("Model not loaded");\n result = await model.embed_batch(params.inputs);\n break;\n case "count_tokens":\n if (!model)\n throw new Error("Model not loaded");\n result = await model.count_tokens(params);\n break;\n default:\n throw new Error(`Unknown method: ${method}`);\n }\n return { id, result, iframe_id };\n } catch (error) {\n console.error("Error processing message:", error);\n return { id, error: error.message, iframe_id };\n }\n}\nprocessMessage({ method: "init" });\n';
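
The code above stores each vector on `item.vec`, and the model entry declares `dims: 512`. When pointing the plugin at a local server, it may be worth verifying that the vectors the server returns actually have the declared length. The check below is a sketch of such a sanity test; `check_dims` is a hypothetical helper, and whether the plugin itself validates dimensions is not stated in this thread.

```javascript
// Hypothetical sanity check: compare a returned vector against the dims
// declared in the model entry (512 in the entry added above).
function check_dims(vec, expected_dims) {
  if (!Array.isArray(vec)) {
    return { ok: false, reason: "embedding is not an array" };
  }
  if (vec.length !== expected_dims) {
    return { ok: false, reason: `expected ${expected_dims} dims, got ${vec.length}` };
  }
  if (!vec.every((val) => Number.isFinite(val))) {
    return { ok: false, reason: "embedding contains non-finite values" };
  }
  return { ok: true };
}
```

A mismatch reported here would suggest adjusting `dims` in the model entry to whatever the loaded model produces.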

@brianpetro
Owner

@jagai thanks for sharing this 😊

PS- It will be easier to configure something like this without code in the future.

🌴

@usernotnull

@daaain @ArtificialAmateur While this isn't an ideal solution, I did manage to set up nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.f32.gguf on LM Studio to work in @brianpetro incredible smart-connections plugin.

What I've done is essentially eliminate the checks on api.openai.com and instead just refactored it to direct to my local LM Studio server. Just beware that by doing so, you're taking away the ability to use the OpenAI embeddings, because we'll be refactoring the components that connect to them rather than adding on existing functionalities.

This is a quick and dirty fix for those who'd rather handle the embeddings locally, and its far from ideal, but it works really great for my use case.

Enjoy!
[…]

I have tried your code, however during the embedding process, LM studio shows the below error:

2024-10-04 10:11:02 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-04 10:11:02 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-04 10:11:02 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}

The .smart-env\multi shows incomplete embedding as many files are only 1kb.

@jagai

jagai commented Oct 4, 2024

I'll need a little bit more info on this if possible.

Could you share which embedding model you tried, along with the version of Smart Connections? I'll do my best to help.

@usernotnull

usernotnull commented Oct 4, 2024

@jagai I am using smart-connections version 2.2.79 (2.2.80 literally just got pushed, but that doesn't affect our discussion).

This is my model loaded in LM Studio:
Screenshot 2024-10-04 163209

This is the obsidian settings:
image

And main.js was edited exactly as you documented. I changed the tokens to 2048 in the object and later in the JSON, thinking it might help, but it didn't.

Here's a txt of the js:
main.txt

The embedding error is happening on certain files, but it's hard to figure out the issue as I have lots of files and I couldn't reach a point where I got 0 errors yet.

EDIT: I have renamed the files, removed metadata, and cleaned the text by removing characters that could break JSON (,./*? etc.), and I'm still getting the same issue. So the issue is not due to the content of the notes.
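
When errors only show up for some notes in a large vault, one way to narrow things down is to send the texts to the server one at a time and record which ones fail. The sketch below assumes an `embed_one` function (hypothetical; any async function that embeds a single string and throws on failure) and is meant as a diagnostic aid, not as part of the plugin.

```javascript
// Diagnostic sketch: embed inputs one by one and collect the failures.
// embed_one is a hypothetical async function supplied by the caller.
async function find_failing_inputs(inputs, embed_one) {
  const failures = [];
  // Sequential on purpose, so server-side log lines stay in input order.
  for (const [index, text] of inputs.entries()) {
    try {
      await embed_one(text);
    } catch (err) {
      failures.push({
        index,
        preview: text.slice(0, 60), // enough to identify the note
        error: err.message,
      });
    }
  }
  return failures;
}
```

If every input fails or the failures look unrelated to content, that points away from the notes and toward the server or its version.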

@brianpetro
Owner

@usernotnull that's cool, thanks for sharing 🌴

@jagai

jagai commented Oct 4, 2024

Could you try switching on LM Studio to gaianet/Nomic-embed-text-v1.5-Embedding-GGUF/nomic-embed-text-v1.5.f16.gguf, give it another go and let me know how it goes?

@usernotnull

usernotnull commented Oct 5, 2024

@jagai unfortunately same issue:

2024-10-05 15:17:50  [INFO] Received request to embed multiple:  ["A Folder > A Title\nBLOCK NOT FOUND (no line_start)"]
2024-10-05 15:17:50 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:17:50  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:17:50  [INFO] Received request to embed multiple:  ["Another Folder> Another Title:\n---\nup: [\"[[Somewhere]]\"]\nrelated: []\ntags: [o..."]
2024-10-05 15:17:50 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-05 15:17:50 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-05 15:17:50 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}

I also notice the issue with any local embedding model: BLOCK NOT FOUND (no line_start)

I went ahead and tested it on a sandbox vault, same issue:

2024-10-05 15:36:32  [INFO] Received request to embed multiple:  ["Plugins make Obsidian special for you:\nWe started making Obsidian with plugins in mind because every..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Plugins make Obsidian special for you\nWe started making Obsidian with plugins in mind because everyo..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Plugins make Obsidian special for you\n## Wild community plugins\r\n\r\nPlugins not just give Obsidian mo..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Vault is just a local folder:\nDifferent than most note-taking apps out there, an Obsidian vault is n..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Vault is just a local folder\nDifferent than most note-taking apps out there, an Obsidian vault is no..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Start Here:\nHi, welcome to Obsidian!\n\n---\n\n## I’m interested in Obsidian\n\nFirst of all, tell me a li..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Start Here\nHi, welcome to Obsidian!\n\n---\n\n## I’m interested in Obsidian\n\nFirst of all, tell me a lit..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Start Here\n---\n\n## I’m interested in Obsidian\n\nFirst of all, tell me a little bit about what's your ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Start Here\n## What is this place?\n\nThis is a sandbox vault in which you can test various functionali..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Adventurer > From plain-text note-taking:\nObsidian is similar to plain-text based note-taking apps i..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Adventurer > From plain-text note-taking\nObsidian is similar to plain-text based note-taking apps in..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Adventurer > From standard note-taking:\nGreat, that means you should already be familiar with taking..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Adventurer > From standard note-taking\nGreat, that means you should already be familiar with taking ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Adventurer > No prior experience:\nThere are plenty of note-taking apps out there, so congratulations..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Adventurer > No prior experience\nThere are plenty of note-taking apps out there, so congratulations ..."]
2024-10-05 15:36:33 [DEBUG] [INFO] [LlamaEmbeddingEngine] All parsed chunks succesfully embedded!
2024-10-05 15:36:33  [INFO] Returning embeddings (not shown in logs)
2024-10-05 15:36:33  [INFO] Received request to embed multiple:  ["Formatting > Callout:\nAs of v0.14.0, Obsidian supports callout blocks, sometimes called \"admonitions..."]
2024-10-05 15:36:33 [DEBUG]
llama_decode_internal: n_tokens == 0
llama_decode: failed to decode, ret = -1
2024-10-05 15:36:33 [DEBUG] [lmstudio-llama-cpp] LLM: Embedding failed: Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0
2024-10-05 15:36:33 [ERROR] [Server Error] {"title":"Failed to embed string","cause":"Failed during string embedding. Message: Unknown exception caused embedding to stop: Failed to decode batch! Error: n_tokens = 0"}
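
With logs this long, it can help to extract only the requests that were followed by an error. The sketch below pairs each "Received request to embed multiple" line with a later `[ERROR]` line, based solely on the log format shown above; `failed_requests_from_log` is a hypothetical helper, and other LM Studio versions may format their logs differently.

```javascript
// Sketch: list the embed requests that were followed by an [ERROR] line.
// Relies only on the log format pasted above; treat it as an assumption elsewhere.
function failed_requests_from_log(log_text) {
  const failed = [];
  let last_request = null;
  for (const line of log_text.split(/\r?\n/)) {
    const match = line.match(/Received request to embed multiple:\s*(.*)$/);
    if (match) {
      last_request = match[1];
      continue;
    }
    if (line.includes("Returning embeddings")) {
      last_request = null; // the pending request succeeded
      continue;
    }
    if (line.includes("[ERROR]") && last_request !== null) {
      failed.push(last_request);
      last_request = null;
    }
  }
  return failed;
}
```

Run against the sandbox log above, this would single out the "Formatting > Callout" request as the one that failed.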

@jagai

jagai commented Oct 5, 2024

@usernotnull I couldn't reproduce the errors on my end. I'm not entirely sure it's related to Obsidian or Smart Connections. It could be something to do with LM Studio, but I'm really not sure.

@usernotnull

Which OS are you on?
Mine is win11.

@jagai

jagai commented Oct 5, 2024

I'm on macOS Sequoia, using Obsidian on my MacBook Air M1... It would be even more difficult for me to help, as I've never tried running Obsidian or LM Studio on Windows, to be honest ☹️

@jagai

jagai commented Oct 5, 2024

@usernotnull A long shot, but since you're on Windows, perhaps giving the mixedbread-ai/mxbai-embed-large-v1 model a shot might yield better results?

@jagai

jagai commented Oct 6, 2024

@usernotnull I've managed to narrow this down to LM Studio 0.3.3. For some reason it causes the models to fail embedding. Tested on LM Studio 0.3.2 and Smart Connections 2.2.81.

You can find LM Studio 0.3.2 at the bottom of the download page (https://lmstudio.ai/download).

Let me know if this works 😃

@usernotnull

You did it 🏆
Thanks :)

Any idea if LM Studio is aware of this issue?

@jagai

jagai commented Oct 6, 2024

Glad it works! Woohoo 🥳
I'm not sure LM Studio is aware of the issue though. Probably would be a good idea to let them know 😄

@usernotnull

usernotnull commented Oct 9, 2024

The LM Studio issue has been resolved.
@jagai's temporary workaround now works well for local embeddings.

@davedawkins

Would anyone who has this working with the latest SmartPlugins and LM Studio care to share their settings, please?
Specifically, the "main" settings for SmartPlugins (where we choose the local embedding models, etc.) and the settings accessed from the chat window where we set up the connection to the local server. I have most of that OK, but can't work out whether I should leave the API key empty.
Also, which model is loaded into LM Studio, and any special settings there?
Thank you

@brianpetro
Owner

@davedawkins I haven't tested LM Studio in a while; it might need a custom adapter if it doesn't strictly follow the OpenAI API format.

I'd expect the API key could be left empty if you haven't configured one in LM Studio.
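
As an illustration of what "left empty" could mean on the client side, here is a sketch that only sends an `Authorization` header when a key is actually set. `build_headers` is a hypothetical helper; whether Smart Connections builds its headers this way is an assumption.

```javascript
// Hypothetical helper: omit the Authorization header when no API key is configured.
function build_headers(api_key) {
  const headers = { "Content-Type": "application/json" };
  const key = typeof api_key === "string" ? api_key.trim() : "";
  if (key) {
    // OpenAI-format servers expect a Bearer token here.
    headers.Authorization = `Bearer ${key}`;
  }
  return headers;
}
```

A local server with no key configured would then receive requests without any `Authorization` header.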

If it's not working, screenshot any errors that appear in the developer console logs and I'll give them a look 🌴

@davedawkins

Thanks @brianpetro - to that end I have just logged an issue related to a "Model not set" error. My guess is that the OpenAI configuration is being unnecessarily validated (and failing) but I'm not sure.

@brianpetro
Owner

@davedawkins are you on the latest version? I thought I already fixed that issue in the last release.

If it's still happening on the latest release, a screenshot of the error would be helpful 🌴

@brianpetro
Owner

@davedawkins never mind, I just saw the other issue #931 🌴
