This repository was archived by the owner on Aug 2, 2024. It is now read-only.
forked from natgaertner/bip-data
CTCL/bip-data
This is essentially the Ballot Information Project. The working branch is new_nat.

bipbuild.py builds the BIP database for each state you pass it. It's mostly based on ersatz (https://github.com/natgaertner/ersatzpg) but has a bunch of its own config ideas. You can see some sample state configs in vf_scripts/exception_states. vf_scripts in general has a ton of quick-and-dirty scripts that helped me manage the parallel state directories. make_state_confs.py might be the most important one, since it uses state_conf_template.py and the exception state files to create a configuration for each state that references the correct locations and names.

bip_build.py has a lot of functions, but you usually just run -all_no_clean, which ditches the old data, inserts new data, cleans up duplicate districts, merges tables that need to be merged, maps sequential keys correctly, and dumps JSON. Running -all will actually drop and remake the partitioned tables, which is a little more drastic than you might want.

Generally, the database is built by importing into a buffer table, then running some SQL commands to square away the keying before inserting into the actual table. A lot of the ugly table creation and keying commands are run using functions in schema/table_tools.py.

schema/create_partitions.py contains a partition generator that is pretty useful. It also contains a permutation generator. Did I know about itertools.permutations and itertools.combinations? I did not.

schema/process_schema.py has some attempts at Python classes that represent a database schema, as far as I needed them to. Some of the ugliest SQL generation, for rekeying, happens in here. It could probably get a lot cleaner. At one point I was trying to build a tree of foreign key relations, detect loops, and use that to generate rekeying commands after the data was loaded in. This became unnecessary, thankfully: we really didn't need a lot of explicit relations, since we were dumping the separate tables out to JSON anyway.

Some of the base configs that are extended for the states are in the data directory. univ_settings.py, table_defaults.py, target_smart_defaults.py, and candidate_defaults.py are the main configs that are inherited. The specific state data sits in data/voterfiles/<two letter state abbreviation>, but the voterfiles directory is omitted for a variety of reasons, not least because it is a couple hundred GB of data.
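
To make the buffer-table approach described above a bit more concrete, here is a minimal sketch of the general pattern. It is not the code from this repository: the table and column names are hypothetical, and it uses Python's built-in sqlite3 module as a stand-in for the PostgreSQL setup so that it runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema: the real table has a sequential key; the buffer
# table holds raw rows identified only by the source's own identifier.
cur.execute(
    "CREATE TABLE precinct ("
    "id INTEGER PRIMARY KEY, source_id TEXT UNIQUE, name TEXT)"
)
cur.execute("CREATE TABLE precinct_buffer (source_id TEXT, name TEXT)")

# Step 1: bulk-load raw rows into the buffer, duplicates and all.
raw_rows = [("A-1", "Ward 1"), ("A-2", "Ward 2"), ("A-1", "Ward 1")]
cur.executemany("INSERT INTO precinct_buffer VALUES (?, ?)", raw_rows)

# Step 2: sort out keying in SQL, then insert into the real table.
# DISTINCT drops the duplicate row; the INTEGER PRIMARY KEY column
# assigns the sequential ids.
cur.execute(
    "INSERT INTO precinct (source_id, name) "
    "SELECT DISTINCT source_id, name FROM precinct_buffer ORDER BY source_id"
)

# Step 3: empty the buffer so it is ready for the next load.
cur.execute("DELETE FROM precinct_buffer")
conn.commit()

rows = cur.execute(
    "SELECT id, source_id, name FROM precinct ORDER BY id"
).fetchall()
print(rows)
```

The point of the pattern is that the messy data never touches the real table: deduplication and key assignment happen in one SQL statement between the buffer and the final insert.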
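
For anyone else who did not know about them, the standard-library functions mentioned above work roughly like this. This is only an illustration of itertools; the field names are made up and do not come from schema/create_partitions.py.

```python
from itertools import combinations, permutations

# Hypothetical values, for illustration only.
fields = ["state", "county", "precinct"]

# Ordered arrangements of 2 fields: order matters, so both
# ("state", "county") and ("county", "state") appear.
ordered_pairs = list(permutations(fields, 2))

# Unordered selections of 2 fields: each pair appears exactly once.
unordered_pairs = list(combinations(fields, 2))

print(len(ordered_pairs))    # 6
print(len(unordered_pairs))  # 3
```

Both functions return lazy iterators, so for large inputs you can loop over them directly instead of building a list.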
Languages
- Python 99.7%
- Shell 0.3%