📄 PDF Items List Extractor and CSV Utility Tool

A command-line tool that extracts list items from PDF documents into CSV and merges multiple CSV files, assigning each merged row a unique ID.

🛠️ Features

Extract Items from PDF: Convert list-like structures in PDF documents into structured CSV format.
Merge CSV Files: Combine multiple CSV files into a single file, maintaining unique IDs through a newly generated sequential ID column.

🖥️ Prerequisites

Python 3.6+
PyMuPDF (fitz)
Pandas
Typer

🚀 Installation

Clone the repository and install dependencies:

git clone https://github.com/GeroZayas/PDF-itemslist-extractor.git

cd PDF-itemslist-extractor

pip install -r requirements.txt

📝 Usage

Extract Items from PDF

python your_script_name.py extract_and_save ./path/to/your/pdf/file.pdf ./desired/output/path/
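
For orientation, here is a minimal sketch of what an extraction step like this could look like with PyMuPDF. It is not the tool's actual implementation: the `parse_list_items` helper, the regular expression for recognising list lines, and the `ID`/`Item` column names are all assumptions made for illustration.

```python
import csv
import re
from typing import List

# Assumed notion of a "list item": a line that starts with a bullet, dash,
# asterisk, or a number followed by "." or ")". The real tool may differ.
ITEM_PATTERN = re.compile(r"^\s*(?:[\u2022\-*]|\d+[.)])\s+(.*\S)\s*$")


def parse_list_items(text: str) -> List[str]:
    """Return the text of every line that looks like a list item."""
    items = []
    for line in text.splitlines():
        match = ITEM_PATTERN.match(line)
        if match:
            items.append(match.group(1))
    return items


def extract_and_save(pdf_path: str, csv_path: str) -> int:
    """Extract list items from a PDF and write them to a CSV file.

    Returns the number of items written.
    """
    # PyMuPDF is imported lazily so parse_list_items can be used without it.
    import fitz

    with fitz.open(pdf_path) as doc:
        text = "\n".join(page.get_text() for page in doc)

    items = parse_list_items(text)
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ID", "Item"])  # hypothetical column names
        writer.writerows(enumerate(items, start=1))
    return len(items)
```

Keeping the line-matching logic in its own function makes it easy to adjust the pattern for PDFs that use a different list style without touching the file handling.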

Merge Multiple CSV Files

python your_script_name.py merge_csv_files ./file1.csv ./file2.csv ./merged_output.csv
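
The merge step can be sketched with pandas roughly as follows. This is an illustrative approximation rather than the tool's own code: the `ID` column name and the choice to discard any `ID` column already present in the inputs are assumptions.

```python
from typing import Sequence

import pandas as pd


def merge_csv_files(
    inputs: Sequence[str], output: str, id_column: str = "ID"
) -> pd.DataFrame:
    """Concatenate CSV files and give every row a fresh sequential ID."""
    frames = [pd.read_csv(path) for path in inputs]
    merged = pd.concat(frames, ignore_index=True)

    # Assumption: IDs carried over from the inputs may collide, so they are
    # dropped and replaced by a new 1-based sequence.
    merged = merged.drop(columns=[id_column], errors="ignore")
    merged.insert(0, id_column, range(1, len(merged) + 1))

    merged.to_csv(output, index=False)
    return merged
```

Regenerating the ID after concatenation, instead of trying to reconcile the IDs of each input file, is what keeps the identifiers unique across the merged dataset.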

📁 Example

Assuming you have a PDF named example.pdf and two CSV files named data1.csv and data2.csv, you can extract items from the PDF and merge the CSV files as follows:

python your_script_name.py extract_and_save ./example.pdf ./extracted_items.csv

python your_script_name.py merge_csv_files ./data1.csv ./data2.csv ./merged_data.csv

🎯 Contributing

Contributions are welcome! Feel free to submit a pull request, or open an issue to discuss improvements or report bugs.

👤 Author

Gero Zayas - @gerozayas

📧 Contact

📧 gerozayas@gmail.com

🌐 Gero Zayas Portfolio
