
A Python library that uses AI to convert unstructured files (like PDFs, HTML, etc.) into structured data.


mixpeek/file-processor


FastAPI File Processor

This FastAPI application accepts a file URL, fetches the file, partitions it into chunks, and sends the first chunk to OpenAI's GPT-4 model for processing.
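The fetch, partition, and process flow described above can be sketched in plain Python. This is a minimal illustration, not the code in main.py: the function names, the fixed-size character chunking, and the `call_model` callable (a stand-in for the GPT-4 request) are all assumptions made for the example.

```python
from typing import Callable, List
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download the resource at `url` and decode it as UTF-8 text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def partition(text: str, max_chars: int = 2000) -> List[str]:
    """Split text into consecutive chunks of at most `max_chars` characters.

    Assumption: fixed-size slicing stands in for the project's real
    partitioning step, which the README does not describe.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def process_url(
    url: str,
    call_model: Callable[[str], str],
    fetch: Callable[[str], str] = fetch_text,
) -> str:
    """Fetch a file, partition it, and pass only the first chunk to the model.

    `call_model` is a hypothetical callable that wraps the GPT-4 request.
    """
    chunks = partition(fetch(url))
    if not chunks:
        raise ValueError("fetched file is empty")
    return call_model(chunks[0])
```

Passing `fetch` and `call_model` as parameters keeps the sketch testable without network access or an API key.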

Installation

  1. Clone this repository:
    git clone https://github.com/yourusername/yourrepository.git
  2. Navigate to the project directory:
    cd yourrepository
  3. Install Poetry if you haven't already:
    curl -sSL https://install.python-poetry.org | python3 -
  4. Install the required Python packages:
    poetry install

Usage

  1. Start the FastAPI server:
    poetry run uvicorn main:app --reload
  2. Send a POST request to the /process endpoint with a JSON body that contains the url parameter. Replace http://example.com/path/to/your/file with the actual URL of the file you want to process:
    curl -X POST "http://localhost:8000/process" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"url\":\"http://example.com/path/to/your/file\"}"
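The same request can be sent from Python using only the standard library. The sketch below assumes the server is running at http://localhost:8000 as in the steps above. The helper names `build_request` and `send` are made up for this example, and because the README does not document the response format, the body is returned as raw text.

```python
import json
from urllib.request import Request, urlopen


def build_request(file_url: str, base_url: str = "http://localhost:8000") -> Request:
    """Build the POST request for the /process endpoint."""
    payload = json.dumps({"url": file_url}).encode("utf-8")
    return Request(
        f"{base_url}/process",
        data=payload,
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: Request, timeout: float = 60.0) -> str:
    """Send the request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example usage (requires the server to be running):
# print(send(build_request("http://example.com/path/to/your/file")))
```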

API Key

The application calls the GPT-4 model through the OpenAI API, so it needs a valid OpenAI API key. Before starting the server, replace the placeholder API key in main.py with your own key.
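As an alternative to editing the key into the source file, the key could be read from an environment variable. This is not what the repository does; it is a sketch of a common pattern. The helper name `load_openai_api_key` is hypothetical, and `OPENAI_API_KEY` is the variable name conventionally used by OpenAI's Python client.

```python
import os


def load_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before starting the server.")
    return key
```

Reading the key from the environment keeps it out of version control, at the cost of one extra setup step before running the server.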

License

This project is licensed under the terms of the MIT license.
