[Question] When no partition handler is specified, the orchestrator pipeline neither generates embeddings nor saves the records #589
Replies: 8 comments
-
Default workflow: (1) extract, (2) partition, (3) gen_embeddings, (4) save_records.
You're removing step 2, so handler 3 doesn't find the partition files to work with. You might have to create new handlers if you want to implement a custom workflow.
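To make the dependency between steps concrete, here is a rough Python model of the handler chain. It is a simplified sketch and not Kernel Memory's actual code: the function bodies, the in-memory `files` dictionary, the placeholder vector, and the `single_partition` handler are all hypothetical. Only the step names and the `.extract.txt` / `.partition.N.txt` file suffixes are taken from the logs in this thread.

```python
# Simplified model of the ingestion pipeline; NOT Kernel Memory's implementation.
# Each handler receives the files produced so far and may add new ones.

EXTRACT_SUFFIX = ".extract.txt"


def extract(files):
    # Adds "<name>.extract.txt" for every uploaded source file.
    out = dict(files)
    for name, text in files.items():
        out[name + EXTRACT_SUFFIX] = text
    return out


def partition(files, max_chars=1000):
    # Splits each extracted file into "<name>.partition.<i>.txt" chunks.
    out = dict(files)
    for name, text in files.items():
        if name.endswith(EXTRACT_SUFFIX):
            base = name[: -len(EXTRACT_SUFFIX)]
            chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)] or [""]
            for i, chunk in enumerate(chunks):
                out[f"{base}.partition.{i}.txt"] = chunk
    return out


def single_partition(files):
    # Hypothetical custom handler: expose each extracted file as one
    # partition, unchanged, so that the embedding step has input.
    out = dict(files)
    for name, text in files.items():
        if name.endswith(EXTRACT_SUFFIX):
            out[name[: -len(EXTRACT_SUFFIX)] + ".partition.0.txt"] = text
    return out


def gen_embeddings(files):
    # Only partition files are embedded; extracted files are skipped.
    out = dict(files)
    for name, text in files.items():
        if ".partition." in name and name.endswith(".txt"):
            out[name + ".text_embedding"] = [float(len(text))]  # placeholder vector
    return out


HANDLERS = {
    "extract": extract,
    "partition": partition,
    "single_partition": single_partition,
    "gen_embeddings": gen_embeddings,
}


def run_pipeline(text, steps, filename="doc.txt"):
    # Runs the named steps in order and returns the names of the saved records.
    files = {filename: text}
    saved = []
    for step in steps:
        if step == "save_records":
            saved = [n for n in files if n.endswith(".text_embedding")]
        else:
            files = HANDLERS[step](files)
    return saved
```

In this model, `run_pipeline("hello", ["extract", "gen_embeddings", "save_records"])` saves nothing because no partition file exists when step 3 runs, while inserting either `partition` or the hypothetical `single_partition` step before `gen_embeddings` yields one record.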
-
I didn't see any documentation about that. It might also be useful to log a warning when no partition is present for the gen_embeddings handler?
-
Some new notes here: https://microsoft.github.io/kernel-memory/service/architecture
-
What about adding a warning in the handler stating that no partitions are present in the data pipeline to process?
-
That should be easy to implement. We could also add similar logs to the other handlers, in case they don't find any data to work with.
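The kind of check being proposed could look roughly like the Python sketch below. It is illustrative only: the function name, logger name, and message wording are hypothetical and are not the warnings that were later added to Kernel Memory. The `.partition.` naming test is an assumption based on the file names in the logs from this thread.

```python
import logging

# Hypothetical logger name, chosen to mirror the handler's step name.
log = logging.getLogger("gen_embeddings")


def partitions_to_embed(file_names, pipeline_id):
    """Return the partition files in a pipeline, warning when there are none."""
    partitions = [name for name in file_names if ".partition." in name]
    if not partitions:
        # Without this message the step "succeeds" silently with nothing to do.
        log.warning(
            "Pipeline '%s': no partition files found, nothing to embed",
            pipeline_id,
        )
    return partitions
```

The same pattern would apply to any step that depends on the output of an earlier one: count the inputs it recognizes and log at warning level when the count is zero.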
-
Work in progress here: https://github.com/microsoft/kernel-memory/pull/304/files#diff-778fe859892cfbd60f9c90403e5504b96ef94276d4fc77522c76f7a778dbbee4. The PR includes other changes and it applies to the dev branch, so it will take some time before a release.
-
Warnings added.
-
Thank you!
-
Context / Scenario
I'm trying to create a very simple pipeline that takes a fairly small file (<500 tokens) as input.
I'd like to simply vectorize my input and save it into my DB without any changes.
What happened?
When the partition handler is not used, the pipeline doesn't create any embeddings and nothing is saved.
Furthermore, no log is generated to indicate this issue.
Importance
a fix would make my life easier
Platform, Language, Versions
.NET C#
Relevant log output
When using the partitioning:
Handler 'partition' processed pipeline '<indexname>/<filename>' successfully
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
dbug: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0] Generating embeddings, pipeline '<indexname>/<filename>'
trce: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0] Skipping file <filename>.txt.extract.txt (not a partition, not synthetic data)
trce: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0] Processing file <filename>.txt.partition.0.txt
trce: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0] Generating embeddings using AI.AzureOpenAI.AzureOpenAITextEmbeddingGenerator, file: <filename>.txt.partition.0.txt
dbug: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0] Saving embedding file <filename>.txt.partition.0.txt.AI.AzureOpenAI.AzureOpenAITextEmbeddingGenerator.TODO.text_embedding
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Handler 'gen_embeddings' processed pipeline '<indexname>/<filename>' successfully
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
dbug: Microsoft.KernelMemory.Handlers.SaveRecordsHandler[0] Saving memory records, pipeline '<indexname>/<filename>'
trce: Microsoft.KernelMemory.Handlers.SaveRecordsHandler[0] Creating index '<indexname>'
trce: Microsoft.KernelMemory.Handlers.SaveRecordsHandler[0] Saving record d=<filename>//p=02a9839039164f40ba153329115cc4e3 in index '<indexname>'
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Handler 'save_records' processed pipeline '<indexname>/<filename>' successfully
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Pipeline '<indexname>/<filename>' complete
-----
When not using the partitioning:
Uploading file '<filename>.txt', size 171 bytes
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] File uploaded: <filename>.txt, 171 bytes
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
dbug: Microsoft.KernelMemory.Handlers.TextExtractionHandler[0] Extracting text, pipeline '<indexname>/<filename>'
dbug: Microsoft.KernelMemory.Handlers.TextExtractionHandler[0] Extracting text from plain text file <filename>.txt
dbug: Microsoft.KernelMemory.Handlers.TextExtractionHandler[0] Saving extracted text file <filename>.txt.extract.txt
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Handler 'extract' processed pipeline '<indexname>/<filename>' successfully
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
dbug: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0] Generating embeddings, pipeline '<indexname>/<filename>'
trce: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0] Skipping file <filename>.txt.extract.txt (not a partition, not synthetic data)
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Handler 'gen_embeddings' processed pipeline '<indexname>/<filename>' successfully
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
dbug: Microsoft.KernelMemory.Handlers.SaveRecordsHandler[0] Saving memory records, pipeline '<indexname>/<filename>'
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Handler 'save_records' processed pipeline '<indexname>/<filename>' successfully
dbug: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Saving pipeline status to '<indexname>/<filename>/__pipeline_status.json'
info: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0] Pipeline '<indexname>/<filename>' complete