Integrated vectorization - sourcefile and storageUrl is null #2279
Comments
@pamelafox Any thoughts on this issue would be much appreciated. Thank you.
I can help with the root cause analysis. In the integrated vectorization file (app/backend/prepdocslib/integratedvectorizerstrategy.py), lines 100-102 are where the index field mappings take place. The sourcefile and storageUrl attributes are not mapped there, and I'm not sure whether that is on purpose.

    selectors=[
        SearchIndexerIndexProjectionSelector(
            target_index_name=index_name,
            parent_key_field_name="parent_id",
            source_context="/document/pages/*",
            mappings=[
                InputFieldMappingEntry(name="content", source="/document/pages/*"),
                InputFieldMappingEntry(name="embedding", source="/document/pages/*/vector"),
                InputFieldMappingEntry(name="sourcepage", source="/document/metadata_storage_name"),
            ],
        ),
    ],
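To make the gap described above concrete, here is a minimal sketch that compares the fields the app is assumed to need against the names targeted by the projection mappings. It uses plain dictionaries in place of `InputFieldMappingEntry` objects so that it runs without the Azure SDK. The `expected_fields` list, the `proposed_extra` entries, and the choice of `/document/metadata_storage_path` as a source for `storageUrl` are assumptions for illustration, not something confirmed by the repository.

```python
# Mappings copied from the selector quoted above, as plain name/source pairs.
current_mappings = [
    {"name": "content", "source": "/document/pages/*"},
    {"name": "embedding", "source": "/document/pages/*/vector"},
    {"name": "sourcepage", "source": "/document/metadata_storage_name"},
]

# Assumption: the index fields the app reads when it builds citations.
expected_fields = ["content", "embedding", "sourcepage", "sourcefile", "storageUrl"]


def unmapped_fields(expected, mappings):
    """Return the expected index fields that no mapping targets, in order."""
    mapped = {m["name"] for m in mappings}
    return [f for f in expected if f not in mapped]


# Hypothetical extra mappings. The source paths are blob indexer metadata
# properties; whether they hold the values the app expects is an assumption.
proposed_extra = [
    {"name": "sourcefile", "source": "/document/metadata_storage_name"},
    {"name": "storageUrl", "source": "/document/metadata_storage_path"},
]

missing_before = unmapped_fields(expected_fields, current_mappings)
missing_after = unmapped_fields(expected_fields, current_mappings + proposed_extra)
print(missing_before)
print(missing_after)
```

If entries like these were translated back into `InputFieldMappingEntry(name=..., source=...)` and appended to the `mappings` list, the indexer would presumably need to be re-run before the fields are populated. Whether these are the right source paths is untested here.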
??
Hi, I am also having the same issue. I tried mapping sourcefile and storageUrl in the skills section, but it does not seem to work. I am not sure where to view the necessary metadata.
For data ingestion, I wanted to enable integrated vectorization, so I followed the instructions in the docs -> data_ingestion.md file.
All my documents are present in blob storage.
After the indexer ran successfully, I used the app and the citation URL was wrong: it was not linked to the file in blob storage.
When I looked at the index, the sourcefile and storageUrl fields were null, which breaks my citations.
The citation also has no file extension, only the file name. I am not sure why.
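For anyone trying to confirm the same symptom, a rough sketch of counting null fields across documents retrieved from the index is shown below. The documents are plain dictionaries with made-up values, and `count_null_fields` is a hypothetical helper, not part of the repository or the Azure SDK.

```python
from collections import Counter


def count_null_fields(documents, fields):
    """Count, per field, how many documents have it missing or set to None."""
    counts = Counter()
    for doc in documents:
        for field in fields:
            if doc.get(field) is None:
                counts[field] += 1
    return dict(counts)


# Hypothetical documents standing in for results exported from the index.
sample_documents = [
    {"id": "1", "sourcepage": "doc-a", "sourcefile": None, "storageUrl": None},
    {"id": "2", "sourcepage": "doc-b", "sourcefile": None, "storageUrl": None},
]

null_counts = count_null_fields(
    sample_documents, ["sourcepage", "sourcefile", "storageUrl"]
)
print(null_counts)
```

A field that is null in every document, as sourcefile and storageUrl are in this sample, suggests it was never populated during indexing rather than being empty for a few files.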