[Bug]: non-latin letters in file names are stripped #2703
Labels
area: backend
Related to backend functionality or under the /backend directory
bug
Something isn't working
What happened?
Non-ASCII letters are removed from file names.
![image](https://private-user-images.githubusercontent.com/807522/341996093-5ef33d46-2ef6-4c32-92f9-7ca65ab1b76d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjExOTg5MTMsIm5iZiI6MTcyMTE5ODYxMywicGF0aCI6Ii84MDc1MjIvMzQxOTk2MDkzLTVlZjMzZDQ2LTJlZjYtNGMzMi05MmY5LTdjYTY1YWIxYjc2ZC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzE3JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcxN1QwNjQzMzNaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT01MjdkZWU2YWE5ZDJmYzI4OWM1Mjc4OWNkYjcyYjNmMTQ1ZGM2N2EyODFiMWIyNWNmMzM0MGFjYjkzYTkyYzc0JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.HLN-5y1Y6sxziJKBG645G3SW2wv0CxAVDgNo0BwmQak)
It's caused by using the file name as the key for storing the upload (https://github.com/QuivrHQ/quivr/blob/main/backend/modules/upload/service/upload_file.py#L81), and upload keys are restricted (supabase/storage#133). Thus Quivr removes non-ASCII characters, due to #1728.
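To illustrate the effect, here is a minimal sketch. It assumes a sanitizer that simply drops every character outside the ASCII range; `strip_non_ascii` is a hypothetical helper written for this example, not the actual Quivr code.

```python
def strip_non_ascii(file_name: str) -> str:
    """Hypothetical sanitizer: drop every character that is not ASCII."""
    return file_name.encode("ascii", "ignore").decode("ascii")


# A Cyrillic file name loses all of its letters.
print(strip_non_ascii("отчёт_2024.pdf"))

# Two different names collapse to the same storage key.
print(strip_non_ascii("отчёт.pdf") == strip_non_ascii("доклад.pdf"))
```

With a sanitizer like this, the name shown to the user is damaged, and distinct files whose names differ only in non-ASCII letters can end up with the same key.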
I made an attempt to fix it, but I'm not pleased with the result. I think it's worth adding an `original_file_name` column to the `knowledge` table and identifying file uploads by UUID. WDYT?
Relevant log output
No response
Twitter / LinkedIn details
No response