The sejm_info repository presents a tool designed to streamline the analysis of draft legal acts (bills) from the Polish parliament (Sejm). Using OpenAI's ChatGPT API together with Python, this toolkit automatically downloads, summarizes, and interprets legislative documents. By leveraging natural language processing, sejm_info transforms complex legal texts into concise summaries, facilitating a clearer understanding of legislative intent and nuances.
This project aims to provide researchers, legal professionals, and interested citizens with a user-friendly interface to explore and analyze legislative proposals efficiently. Whether for academic research, professional review, or public awareness, sejm_info serves as your go-to resource for digesting the intricacies of Polish legislative works through the lens of AI.
Yes, the description above was generated by ChatGPT ;) This is a simple proof of concept (POC) I did. Screenshots below:
ChatGPT prompt files:
- https://github.com/msporna/sejminfo/blob/main/sejm_parser/chatgpt_prompt.txt
- https://github.com/msporna/sejminfo/blob/main/sejm_parser/chatgpt_prompt_hashtags.txt
- Set the OpenAI API key in the OPENAI_API_KEY environment variable.
- Run sejm_parser (main.py) manually. It updates the SQLite database with new records and also:
  a) backs up previously downloaded files into a zip archive
  b) copies the SQLite file to the /web folder once it is done
- Restart the web app after that.
- New data will then be served by the frontend.
- If the links to the Sejm site change, update them in parser/sql_handler, where the links are built.
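The backup and copy steps that the parser performs after a run can be sketched roughly as follows. This is only an illustration, not the code in main.py: the function names `backup_downloads` and `publish_database`, the archive naming, and all paths are hypothetical.

```python
import shutil
import zipfile
from datetime import datetime
from pathlib import Path


def backup_downloads(download_dir: Path, backup_dir: Path) -> Path:
    """Zip every file under download_dir into a timestamped archive.

    Hypothetical helper; backup_dir should live outside download_dir
    so the archive does not end up inside itself.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    archive = backup_dir / f"downloads_{stamp}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(download_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to download_dir inside the archive.
                zf.write(path, path.relative_to(download_dir).as_posix())
    return archive


def publish_database(db_path: Path, web_dir: Path) -> Path:
    """Copy the SQLite file into the web folder (hypothetical helper)."""
    web_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(db_path, web_dir / db_path.name))
```

Because the web app reads its own copy of the database, it has to be restarted after the copy to pick up the new file.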
OPENAI_API_KEY='{key_here}' python main.py
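How main.py reads the key is not shown here. A minimal sketch, assuming it is read from the process environment, could fail early with a clear message when the variable is missing; `get_openai_api_key` is a hypothetical helper name.

```python
import os


def get_openai_api_key() -> str:
    """Return the key from OPENAI_API_KEY or raise if it is missing.

    Hypothetical helper, not taken from the repository.
    """
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running main.py"
        )
    return key
```

Checking the key before any download starts avoids doing the OCR work and then failing on the first API call.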
IMPORTANT: install Tesseract:
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
sudo apt-get install tesseract-ocr-pol
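To confirm that Tesseract and the Polish language data are installed before running the parser, a quick check like the one below may help. It is a sketch that only relies on the `tesseract --list-langs` command; the function name `polish_ocr_available` is made up for this example.

```python
import shutil
import subprocess


def polish_ocr_available() -> bool:
    """Return True if the tesseract binary is on PATH and lists 'pol'."""
    if shutil.which("tesseract") is None:
        return False
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some Tesseract versions print the language list to stderr.
    langs = (result.stdout + result.stderr).split()
    return "pol" in langs


if __name__ == "__main__":
    print("Polish OCR available:", polish_ocr_available())
```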
Summarizing 45 files costs about $2 (GPT-3.5 Turbo).
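For a rough budget, the figure above can be scaled linearly. This is only a back-of-the-envelope sketch: it assumes cost grows in proportion to the number of files, which ignores differences in document length and any change in API pricing.

```python
# Observed figure from this README: about 2 USD for 45 files.
OBSERVED_COST_USD = 2.0
OBSERVED_FILES = 45


def estimate_cost_usd(n_files: int) -> float:
    """Linearly scale the observed cost to n_files (rough estimate only)."""
    return round(n_files * OBSERVED_COST_USD / OBSERVED_FILES, 2)
```

That works out to roughly 4 to 5 cents per file.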
- clone the latest version
- go to the root folder
- run:
docker-compose down
- run:
docker-compose up -d --build