Markdrop

A Python package for converting PDFs to Markdown while extracting images and tables. It can also generate text descriptions of the extracted tables and images using several LLM clients, among other features. Markdrop is available on PyPI.

Features

Installation

pip install markdrop

To use the CLI from a local clone of the repository, install the package in editable mode from the repository root:

python -m pip install -e .

Python Package Index (PyPI) Page: https://pypi.org/project/markdrop

Quick Start

Using the MarkDrop CLI

After installing the package, you can use the markdrop command-line interface.

1. Convert PDF to Markdown and HTML:

markdrop convert <input_path> --output_dir <output_directory> [--add_tables]

<input_path>: Path or URL to the input PDF file.
<output_directory>: Directory to save output files (default: output).
--add_tables: (Optional) Add downloadable tables to the HTML output.

Example:

markdrop convert my_document.pdf --output_dir processed_docs --add_tables
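
To convert a whole folder, the same command can be assembled once per PDF. The sketch below only builds the argument lists; the helper name `build_convert_commands` and the one-sub-directory-per-document layout are assumptions made for illustration and are not part of Markdrop.

```python
from pathlib import Path


def build_convert_commands(pdf_dir, output_root="output", add_tables=False):
    """Return one `markdrop convert` argument list per PDF found in pdf_dir."""
    commands = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # Assumed layout: one output sub-directory per document, named after the PDF.
        cmd = [
            "markdrop", "convert", str(pdf),
            "--output_dir", str(Path(output_root) / pdf.stem),
        ]
        if add_tables:
            cmd.append("--add_tables")
        commands.append(cmd)
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the `markdrop` command is installed.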

2. Generate Descriptions for Images and Tables in a Markdown File:

markdrop describe <input_path> --output_dir <output_directory> --ai_provider <provider> [--remove_images] [--remove_tables]

<input_path>: Path to the markdown file.
<output_directory>: Directory to save the processed file (default: output).
<provider>: AI provider to use (gemini or openai).
--remove_images: (Optional) Remove images from the markdown file.
--remove_tables: (Optional) Remove tables from the markdown file.

Example:

markdrop describe my_markdown.md --output_dir described_content --ai_provider gemini --remove_images
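
Before sending a file to an AI provider, it can help to see how many images and tables it contains. The following is a rough sketch that assumes standard Markdown syntax (`![alt](path)` images and pipe tables with a `---` delimiter row); the function `count_images_and_tables` is hypothetical and does not reflect how Markdrop itself detects them.

```python
import re

# Inline image reference: ![alt text](path-or-url)
IMAGE_RE = re.compile(r"!\[[^\]]*\]\([^)]+\)")
# Pipe-table delimiter row with at least two columns, e.g. "| --- | :---: |"
TABLE_DELIM_RE = re.compile(r"^\s*\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)+\|?\s*$")


def count_images_and_tables(markdown_text):
    """Return (image_count, table_count) for a Markdown string."""
    images = len(IMAGE_RE.findall(markdown_text))
    # Each table has exactly one delimiter row, so counting those counts tables.
    tables = sum(
        1 for line in markdown_text.splitlines() if TABLE_DELIM_RE.match(line)
    )
    return images, tables
```

Tables stored as raw HTML would not be counted by this sketch.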

3. Analyze Images in a PDF File:

markdrop analyze <input_path> --output_dir <output_directory> [--save_images]

<input_path>: Path or URL to the PDF file.
<output_directory>: Directory to save analysis results (default: output/analysis).
--save_images: (Optional) Save extracted images.

Example:

markdrop analyze report.pdf --output_dir pdf_analysis --save_images

4. Set Up API Keys for AI Providers:

markdrop setup <provider>

<provider>: The AI provider to set up (gemini or openai).

Example:

markdrop setup gemini

5. Generate Descriptions for Images (Standalone):

markdrop generate <input_path> --output_dir <output_directory> [--prompt <prompt_text>] [--llm_client <client1> <client2> ...]

<input_path>: Path to an image file or a directory of images.
<output_directory>: Directory to save the descriptions CSV (default: output/descriptions).
--prompt: (Optional) Prompt for the AI model (default: "Describe the image in detail.").
--llm_client: (Optional) List of LLM clients to use (default: gemini). Available: qwen, gemini, openai, llama-vision, molmo, pixtral.

Example:

markdrop generate my_images/ --output_dir image_descriptions --prompt "What is in this picture?" --llm_client gemini openai
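
When the client list comes from user input, it may be worth checking it against the available names before invoking the CLI. This is a minimal sketch; `build_generate_command` is a hypothetical helper, and only the flags and client names documented above are used.

```python
AVAILABLE_CLIENTS = ("qwen", "gemini", "openai", "llama-vision", "molmo", "pixtral")


def build_generate_command(input_path, output_dir="output/descriptions",
                           prompt=None, llm_clients=("gemini",)):
    """Build the argument list for `markdrop generate`, rejecting unknown clients."""
    unknown = [c for c in llm_clients if c not in AVAILABLE_CLIENTS]
    if unknown:
        raise ValueError(f"Unknown LLM client(s): {', '.join(unknown)}")
    cmd = ["markdrop", "generate", str(input_path), "--output_dir", str(output_dir)]
    if prompt is not None:
        # Omitting --prompt leaves the CLI's default prompt in effect.
        cmd += ["--prompt", prompt]
    cmd += ["--llm_client", *llm_clients]
    return cmd
```

The resulting list can be handed to `subprocess.run`, which avoids shell quoting problems with prompts that contain spaces or quotes.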

Advanced PDF Processing with MarkDrop (Python API)

from markdrop import markdrop, MarkDropConfig, add_downloadable_tables
from pathlib import Path
import logging

# Configure processing options
config = MarkDropConfig(
    image_resolution_scale=2.0,        # Scale factor for image resolution
    download_button_color='#444444',   # Color for download buttons in HTML
    log_level=logging.INFO,           # Logging detail level
    log_dir='logs',                   # Directory for log files
    excel_dir='markdropped-excel-tables'  # Directory for Excel table exports
)

# Process PDF document
input_doc_path = "path/to/input.pdf"
output_dir = Path('output_directory')

# Convert PDF and generate HTML with images and tables
html_path = markdrop(input_doc_path, str(output_dir), config)

# Add interactive table download functionality
downloadable_html = add_downloadable_tables(html_path, config)

AI-Powered Content Analysis (Python API)

from markdrop import setup_keys, process_markdown, ProcessorConfig, AIProvider, logger
from pathlib import Path

# Set up API keys for AI providers
setup_keys(key='gemini')  # or setup_keys(key='openai')

# Configure AI processing options
config = ProcessorConfig(
    input_path="path/to/markdown/file.md",    # Input markdown file path
    output_dir=Path("output_directory"),      # Output directory
    ai_provider=AIProvider.GEMINI,            # AI provider (GEMINI or OPENAI)
    remove_images=False,                      # Keep or remove original images
    remove_tables=False,                      # Keep or remove original tables
    table_descriptions=True,                  # Generate table descriptions
    image_descriptions=True,                  # Generate image descriptions
    max_retries=3,                           # Number of API call retries
    retry_delay=2,                           # Delay between retries in seconds
    gemini_model_name="gemini-2.5-flash",    # Gemini model for images
    gemini_text_model_name="gemini-2.5-flash",  # Gemini model for text
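    # DEFAULT_IMAGE_PROMPT and DEFAULT_TABLE_PROMPT are not imported above; define or import them first
    image_prompt=DEFAULT_IMAGE_PROMPT,        # Custom prompt for image analysis
    table_prompt=DEFAULT_TABLE_PROMPT         # Custom prompt for table analysis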
)

# Process markdown with AI descriptions
output_path = process_markdown(config)
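
The `max_retries` and `retry_delay` options describe a fixed-delay retry policy. The sketch below illustrates one common reading of such a policy; it is not Markdrop's implementation, `call_with_retries` is a hypothetical helper, and treating `max_retries` as the number of attempts after the first call is an assumption.

```python
import time


def call_with_retries(func, max_retries=3, retry_delay=2):
    """Call func(); on failure wait retry_delay seconds and retry up to max_retries times."""
    if max_retries < 0:
        raise ValueError("max_retries must be zero or greater")
    last_error = None
    # One initial attempt plus max_retries further attempts (assumed semantics).
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                time.sleep(retry_delay)
    # All attempts failed: surface the most recent error to the caller.
    raise last_error
```

With the values shown above (`max_retries=3`, `retry_delay=2`), a call that keeps failing would be attempted four times over roughly six seconds under this reading.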

Image Description Generation (Python API)

from markdrop import generate_descriptions

prompt = "Give textual highly detailed descriptions from this image ONLY, nothing else."
input_path = 'path/to/img_file/or/dir'
output_dir = 'data/output'
llm_clients = ['gemini', 'llama-vision']  # Available: ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral']

generate_descriptions(
    input_path=input_path,
    output_dir=output_dir,
    prompt=prompt,
    llm_client=llm_clients
)
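
Since `input_path` may be a single image or a directory, it can be useful to preview which files would be picked up before spending API calls. This sketch is independent of Markdrop; `list_images` is a hypothetical helper and the set of file extensions is an assumption, not the list Markdrop actually accepts.

```python
from pathlib import Path

# Assumed extensions; adjust to whatever your LLM clients accept.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def list_images(input_path):
    """Return image files for a path that is either one file or a directory."""
    path = Path(input_path)
    if path.is_file():
        return [path] if path.suffix.lower() in IMAGE_SUFFIXES else []
    # Directory case: top-level files only, in a stable order.
    return sorted(
        p for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

A path that does not exist raises `FileNotFoundError` from `iterdir()`, which surfaces a mistyped directory early.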

API Reference

Core Functions

markdrop(input_doc_path: str, output_dir: str, config: Optional[MarkDropConfig] = None) -> Path

Converts PDF to markdown and HTML with enhanced features.

Parameters:

input_doc_path (str): Path to input PDF file
output_dir (str): Output directory path
config (MarkDropConfig, optional): Configuration options for processing

add_downloadable_tables(html_path: Path, config: Optional[MarkDropConfig] = None) -> Path

Adds interactive table download functionality to HTML output.

Parameters:

html_path (Path): Path to HTML file
config (MarkDropConfig, optional): Configuration options

Configuration Classes

MarkDropConfig

Configuration for PDF processing:

image_resolution_scale (float): Scale factor for image resolution (default: 2.0)
download_button_color (str): HTML color code for download buttons (default: '#444444')
log_level (int): Logging level (default: logging.INFO)
log_dir (str): Directory for log files (default: 'logs')
excel_dir (str): Directory for Excel table exports (default: 'markdropped-excel-tables')
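
If these values come from user input or a settings file, a quick sanity check before building the config can catch typos. The sketch below is not part of Markdrop; `check_markdrop_options` is a hypothetical helper, and restricting `download_button_color` to hex codes is an assumption (the library may accept other HTML color forms).

```python
import re

# Matches #RGB or #RRGGBB hex color codes.
HEX_COLOR_RE = re.compile(r"^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$")


def check_markdrop_options(image_resolution_scale=2.0, download_button_color="#444444"):
    """Return a list of problems found; an empty list means the values look valid."""
    problems = []
    if not isinstance(image_resolution_scale, (int, float)) or image_resolution_scale <= 0:
        problems.append("image_resolution_scale must be a positive number")
    if not isinstance(download_button_color, str) or not HEX_COLOR_RE.match(download_button_color):
        problems.append("download_button_color must be a hex code such as '#444444'")
    return problems
```

The defaults in the signature mirror the documented defaults, so calling the function with no arguments returns an empty list.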

ProcessorConfig

Configuration for AI processing:

input_path (str): Path to markdown file
output_dir (str): Output directory path
ai_provider (AIProvider): AI provider selection (GEMINI or OPENAI)
remove_images (bool): Whether to remove original images
remove_tables (bool): Whether to remove original tables
table_descriptions (bool): Generate table descriptions
image_descriptions (bool): Generate image descriptions
max_retries (int): Maximum API call retries
retry_delay (int): Delay between retries in seconds
gemini_model_name (str): Gemini model for image processing
gemini_text_model_name (str): Gemini model for text processing
image_prompt (str): Custom prompt for image analysis
table_prompt (str): Custom prompt for table analysis

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

Development Setup

Clone the repository:

git clone https://github.com/shoryasethia/markdrop.git  
cd markdrop

Create a virtual environment:

python -m venv venv  
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install development dependencies:

pip install -r requirements.txt

Project Structure

markdrop/  
├── LICENSE  
├── README.md  
├── CONTRIBUTING.md  
├── CHANGELOG.md  
├── requirements.txt  
├── setup.py  
└── markdrop/ 
    ├── __init__.py 
    ├── src
    │   └── markdrop-logo.png
    ├── main.py
    ├── process.py
    ├── api_setup.py
    ├── parse.py
    ├── utils.py  
    ├── helper.py
    ├── ignore_warnings.py
    ├── run.py
    └── models/
        ├── __init__.py
        ├── .env
        ├── img_descriptions.py
        ├── logger.py
        ├── model_loader.py
        ├── responder.py
        └── setup_keys.py

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for version history.

Code of Conduct

Please note that this project follows our Code of Conduct.

Support

Open an issue
