Convert PDFs, with a focus on academic papers, into text-only Markdown files that are easy for both humans and LLMs to read.
- Text, tables and equations are parsed using Mistral OCR.
- Figures are converted to a textual description using a selected vision model (see below).
- Additional postprocessing is available, such as splitting the file into multiple parts (main, appendix, backmatter) and fetching a bibtex.
- Example: We converted all our research group papers to Markdown in this repo.
- You need a Mistral AI API key to use `paper2llm`. Their free API tier is compatible with `paper2llm`, within rate limits.
- For the image-to-text conversion, multiple providers are supported.
- You should read the API Keys Security Guide before using the app with your API keys.
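
The optional splitting step listed above (main, appendix, backmatter) can be pictured with a minimal sketch. This is not the code `paper2llm` uses: the function name `split_paper`, the heading patterns, and the rule of switching parts at matching Markdown headings are all assumptions made for illustration.

```python
import re

# Hypothetical heading patterns; the real tool may detect sections differently.
APPENDIX = re.compile(r"^#+\s*(appendix|supplementary)", re.IGNORECASE)
BACKMATTER = re.compile(
    r"^#+\s*(references|bibliography|acknowledge?ments?)", re.IGNORECASE
)


def split_paper(markdown: str) -> dict[str, str]:
    """Assign each line to 'main', 'appendix' or 'backmatter'.

    The current part changes whenever a Markdown heading matches one of
    the patterns above; all following lines go to that part until the
    next matching heading.
    """
    parts: dict[str, list[str]] = {"main": [], "appendix": [], "backmatter": []}
    current = "main"
    for line in markdown.splitlines():
        if APPENDIX.match(line):
            current = "appendix"
        elif BACKMATTER.match(line):
            current = "backmatter"
        parts[current].append(line)
    return {name: "\n".join(lines) for name, lines in parts.items()}


if __name__ == "__main__":
    doc = "# Title\nIntro text\n## References\n[1] A\n## Appendix A\nProof"
    for name, text in split_paper(doc).items():
        print(f"--- {name} ---\n{text}")
```

A heuristic like this misses numbered headings such as "5. References", so the output of any automatic split is worth a quick manual check.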
`paper2llm` was written by Luigi Acerbi using Claude 3.7 Sonnet and Athanor.
You can follow me on X and Bluesky.
After the OCR step, figures are converted to a Markdown text description using vision models such as Mistral AI's Mistral Small or Google's Gemini 2.0 Flash. You can select the desired vision model via a dropdown menu, based on which API keys you entered.
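
As a rough illustration of how a dropdown could be populated from the API keys entered, here is a minimal sketch. The `VisionModel` class, the provider identifiers, and the `available_models` function are hypothetical and are not taken from the `paper2llm` code base; the catalogue lists only models named in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionModel:
    provider: str  # hypothetical provider id, e.g. "mistral" or "gemini"
    name: str      # display name shown in the dropdown


# Illustrative catalogue, in the order it would appear in the menu.
CATALOGUE = [
    VisionModel("gemini", "Gemini 2.0 Flash"),
    VisionModel("mistral", "Mistral Small"),
    VisionModel("mistral", "Pixtral Large"),
]


def available_models(api_keys: dict[str, str]) -> list[VisionModel]:
    """Return the models whose provider has a non-empty API key."""
    return [
        model
        for model in CATALOGUE
        if api_keys.get(model.provider, "").strip()
    ]


if __name__ == "__main__":
    # Only a Mistral key was entered, so only Mistral models are offered.
    for model in available_models({"mistral": "example-key"}):
        print(model.name)
```

Keeping the catalogue as plain data means that adding a provider only requires a new entry, not a change to the selection logic.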
Notes on vision model choice:
- Both Mistral AI and Google Gemini offer a free API tier.
- Gemini 2.0 Flash is our currently recommended model for `paper2llm`. It is included in the Gemini API free tier or is otherwise very cheap, and shows very good performance.
- If you prefer to stick to only using the Mistral AI API, the default free Mistral AI model, Mistral Small, is a top-performing model in its size category and generally works well.
- Pixtral Large may work better for understanding complex diagrams and concepts, but it's a premier model; the API call is not rejected, but it might redirect to a free model if no API credits are available.
- Other premium models such as OpenAI's GPT-4o, Anthropic's Claude 3.7 Sonnet or Google's Gemini 2.0 Pro might work better for complex figures, but beware of API costs.
- We have no affiliation or financial relationship with Mistral AI, besides sympathy for a European AI company and appreciation for their AI models, nor with any other LLM providers.
- This is a research preview, as they say. Use at your own risk and with all the caveats of modern AI and LLM usage.
- In particular, image descriptions might be off in clear or subtle ways and you should double-check and fix them as needed.
`paper2llm` is released under the terms of the MIT License.