This repo is for exploring language datasets and models on German politics. Built as part of Data Science Retreat (DSR) Berlin.
Python version: 3.10.13
Install requirements:
pip install -r requirements.txt
So far, this repo includes the following:
get_bundestag_protocols.ipynb
: Pulling parliament protocols via the Bundestag API.
get_party_manifestos.ipynb
: Extracting the raw text from party manifestos ("Parteiprogramme") in PDF format.
get_wahlomat_responses.ipynb
: Extracting the statements and responses from the "Wahl-O-Mat" tool for voter decision-making. Potentially useful for generating prompts and for a "ground truth" response of a party.
rag.ipynb
: A basic exploration of retrieval-augmented generation (RAG) within this context.
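
To give a rough idea of what the retrieval step in RAG involves, here is a minimal sketch using only the Python standard library. It is not the code from rag.ipynb: the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the bag-of-words cosine scoring, and the placeholder chunks are all assumptions made for illustration. A real setup would more likely use embeddings and a vector store, with chunks taken from the protocols or manifestos.

```python
import math
import re
from collections import Counter

# Placeholder chunks, invented for illustration only (not real manifesto text).
CHUNKS = [
    "Party A wants to expand wind and solar energy.",
    "Party B proposes lowering income tax for families.",
    "Party C plans to invest in rail and public transport.",
]


def tokenize(text: str) -> list[str]:
    # Lowercase and keep runs of letters, including German umlauts and ß.
    return re.findall(r"[a-zäöüß]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by similarity to the query and return the top k.
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    # Assemble the retrieved chunks and the question into one prompt string.
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    question = "What is proposed on solar energy?"
    print(build_prompt(question, retrieve(question, CHUNKS, k=1)))
```

The resulting prompt string would then be passed to a language model of choice; that generation step is left out here because it depends on the model and API used.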