Feature Request: Convert PDF to Markdown inside Chat UI #441

Closed
julien-blanchon opened this issue Sep 16, 2023 · 11 comments

Comments

@julien-blanchon
Contributor

julien-blanchon commented Sep 16, 2023

Feature Description:

I'd like to propose the following feature: add a PDF to Markdown converter and then include the resulting Markdown content directly in the chat. This feature would not only enhance the user experience but also provide a seamless way to discuss and reference content from PDFs, especially research papers.

It's worth noting that similar features exist in platforms such as the Anthropic Claude 2 and Perplexity.AI interfaces. However, their implementations rely primarily on basic text extraction, which often results in a lossy conversion: mathematical equations, tables, and other complex formatting are not retained. My proposed implementation aims to address this limitation with a more comprehensive, lossless conversion using the paid Mathpix API, and possibly other services in the future.

Design Implementation:

  • Dropdown on Hover: When a user hovers over the chat form with a file, a dropdown menu appears, offering an option to "Upload PDF".
  • Once the file is released, it is sent to be converted.
  • After the conversion, the Markdown content is appended to the chat.
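
The three steps above can be sketched roughly as follows. This is only an illustration, not the code from the pull request: `DroppedFile`, `PdfToMarkdown`, `isPdf`, `appendMarkdown`, and `handleDrop` are hypothetical names, and the converter is passed in as a function so that no particular conversion service is assumed.

```typescript
// Hypothetical shape of a dropped file; in a browser this would be a real File object.
interface DroppedFile {
  name: string;
  type: string; // MIME type as reported by the browser, may be empty
  bytes: Uint8Array;
}

// Hypothetical converter signature. A real converter might call an external
// service (for example Mathpix); here it is injected by the caller.
type PdfToMarkdown = (file: DroppedFile) => Promise<string>;

// Accept a file if either its MIME type or its extension says it is a PDF.
function isPdf(file: DroppedFile): boolean {
  return file.type === "application/pdf" || file.name.toLowerCase().endsWith(".pdf");
}

// Append converted Markdown to the current chat message, separated by a blank line.
function appendMarkdown(message: string, markdown: string): string {
  return message.length > 0 ? `${message}\n\n${markdown}` : markdown;
}

// Called once the files are released over the chat form:
// convert every PDF and append the result to the message being composed.
async function handleDrop(
  files: DroppedFile[],
  currentMessage: string,
  convert: PdfToMarkdown
): Promise<string> {
  let message = currentMessage;
  for (const file of files) {
    if (!isPdf(file)) continue; // non-PDF files are ignored in this sketch
    message = appendMarkdown(message, await convert(file));
  }
  return message;
}
```

Keeping the converter as a parameter means the drop handling can be exercised with a stub, and the conversion backend can change without touching the UI logic.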

Example

demo (1)

Pull Request

#442

This was referenced Sep 18, 2023
@nsarrazin
Collaborator

Hey @julien-blanchon thanks for contributing all of this! I'm a bit busy atm with the agents feature but I have seen your PR #449 and I'll review it once I have a bit more time 😁 Just so you know it hasn't been forgotten haha

@julien-blanchon
Contributor Author

Yes, but this is still a draft. As it doesn't fit the usage of 99% of hf-chat users, I don't know if we should merge it or not (or maybe disable the feature by default).
However, we could maybe generalize the file features with:

  • The file drag and drop components
  • An abstract system for processing file objects depending on the user settings (processing with Mathpix could be one of them).
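
As a rough illustration of the second point, a settings-driven processing system could look something like the sketch below. All names here (`UploadedFile`, `FileProcessor`, `UserSettings`, `pickProcessor`, and the processor ids) are hypothetical, and the `mathpix-pdf` entry is a stand-in that does not call any real API.

```typescript
// Hypothetical minimal representation of an uploaded file.
interface UploadedFile {
  name: string;
  mimeType: string;
  text: string; // raw text content, if any
}

// A processor turns an uploaded file into text that can be added to the chat.
interface FileProcessor {
  id: string;
  accepts(file: UploadedFile): boolean;
  process(file: UploadedFile): Promise<string>;
}

// Hypothetical user settings: only the listed processors are active.
interface UserSettings {
  enabledProcessors: string[];
}

// Passes plain-text files through unchanged.
const plainTextProcessor: FileProcessor = {
  id: "plain-text",
  accepts: (file) => file.mimeType === "text/plain",
  process: (file) => Promise.resolve(file.text),
};

// Stand-in for a PDF-to-Markdown processor; a real one would call a conversion service.
const mathpixPdfProcessor: FileProcessor = {
  id: "mathpix-pdf",
  accepts: (file) => file.mimeType === "application/pdf",
  process: (file) => Promise.resolve(`<!-- Markdown converted from ${file.name} -->`),
};

const registry: FileProcessor[] = [plainTextProcessor, mathpixPdfProcessor];

// The paid conversion is off by default, matching the "disable by default" idea.
const defaultSettings: UserSettings = { enabledProcessors: ["plain-text"] };

// Return the first enabled processor that accepts the file, or undefined if none does.
function pickProcessor(
  file: UploadedFile,
  settings: UserSettings,
  processors: FileProcessor[] = registry
): FileProcessor | undefined {
  return processors.find(
    (p) => settings.enabledProcessors.indexOf(p.id) !== -1 && p.accepts(file)
  );
}
```

With this shape, a user who enables `mathpix-pdf` in their settings gets PDFs routed to that processor, while everyone else sees no change in behavior.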

@nsarrazin
Collaborator

Yeah I think in particular your drag&drop feature looks great! Would it be okay if we used it for agents?

@julien-blanchon
Contributor Author

Yes, of course. You're talking about #462?

@nsarrazin
Collaborator

Yeah! We already support uploading images and audio in the backend for agents (stuff like speech transcription, image description); there's just no UI component for it, so I think yours could work well when we implement that feature!

@hungryalgo

@julien-blanchon
Adding file support, especially for PDFs, is such an important feature. Are you going to merge #442 later? What is the most current branch this is being worked on? I see #449, but it's empty.

I do not see the agent branch ending up using the file drop UI.

Any update on how we want to proceed from here would be nice. Why was this PR dropped? I would love to see this merged to main.

Many people want this: #609, #482

Thanks! Very nice to see #442.

@gary149
Collaborator

gary149 commented Dec 6, 2023

Hi, good news! @mishig25 has started working on this feature so it should come soon!

@julien-blanchon
Contributor Author

@mishig25 Feel free to tag me when a PR is out, I would love to help with testing 😍.

@iChristGit

Does this mean we could also add text files as well as HTML files? That would be amazing.

@mishig25
Collaborator

@julien-blanchon please feel free to test #641

@jackbravo

Closing in favor of #609, right? Which is the one that the referenced PR (641) closes.
