This repository was archived by the owner on Aug 2, 2024. It is now read-only.

CTCL/bip-data

 
 


This is essentially the Ballot Information Project. The working branch is new_nat.
bipbuild.py builds the BIP database for each state you pass it. It's mostly based on ersatz: https://github.com/natgaertner/ersatzpg but has a bunch of its own config ideas. You can see some sample state configs in vf_scripts/exception_states.
vf_scripts in general has a ton of quick and dirty script solutions that helped me manage the parallel state directories. make_state_confs.py might be the most important one, since it uses the state_conf_template.py and exception state files to create configurations for each state that reference the correct locations and names.
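As a rough illustration of the template-plus-exceptions idea, here is a minimal sketch. It is not the actual make_state_confs.py: the template text, the EXCEPTIONS dict, and the function names are all hypothetical, and the real template and exception files may look quite different.

```python
from string import Template

# Hypothetical template text; the real state_conf_template.py is not reproduced here.
STATE_CONF_TEMPLATE = Template(
    "STATE = '$state'\n"
    "VOTER_FILE_DIR = 'data/voterfiles/$state'\n"
)

# Hypothetical stand-in for the exception state files: states whose config
# cannot be produced from the template and is supplied by hand instead.
EXCEPTIONS = {
    "NY": "STATE = 'NY'\nVOTER_FILE_DIR = 'data/voterfiles/NY/custom_layout'\n",
}


def make_state_conf(state):
    """Return config text for one state, preferring a hand-written exception."""
    if state in EXCEPTIONS:
        return EXCEPTIONS[state]
    return STATE_CONF_TEMPLATE.substitute(state=state)


def make_all_confs(states):
    """Build a {state: config text} mapping for every requested state."""
    return {state: make_state_conf(state) for state in states}


if __name__ == "__main__":
    for state, conf in make_all_confs(["OH", "NY"]).items():
        print(f"--- {state} ---")
        print(conf)
```

The point of the design is that most states share one template, and only the awkward ones need a hand-maintained file.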
bip_build.py has a lot of functions, but you usually just run it with -all_no_clean, which ditches the old data, inserts new data, cleans up duplicate districts, merges tables that need to be merged, maps sequential keys correctly, and dumps json.
Running -all will actually drop and remake the partitioned tables, which is a little more drastic than you might want.
Generally, the database is built by importing into a buffer table, then performing some SQL commands to square everything away with keying before inserting into the actual table. A lot of the ugly table creation and keying commands are run using functions in schema/table_tools.py. schema/create_partitions.py contains a partition generator that is pretty useful. It also contains a permutation generator. Did I know about itertools.permutations and itertools.combinations? I did not.
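The buffer-table pattern described above can be sketched roughly as follows. This is a simplified illustration, not the code in schema/table_tools.py: it uses the standard library's sqlite3 as a stand-in for the real database, and the district table, its columns, and the load_via_buffer function are hypothetical names chosen for the example.

```python
import sqlite3


def load_via_buffer(conn, rows):
    """Stage (source_id, name) rows in a buffer table, then insert them into
    the real table under sequential integer keys.

    Returns a {source_id: sequential id} mapping for the whole real table.
    """
    cur = conn.cursor()
    # The "actual" table: sequential key plus the key from the source file.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS district ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "source_id TEXT UNIQUE, "
        "name TEXT)"
    )
    # The buffer table takes the raw import, duplicates and all.
    cur.execute("DROP TABLE IF EXISTS district_buffer")
    cur.execute("CREATE TABLE district_buffer (source_id TEXT, name TEXT)")
    cur.executemany("INSERT INTO district_buffer VALUES (?, ?)", rows)
    # Square away keying: collapse duplicates and skip rows that are already
    # present, so existing sequential keys stay stable across loads.
    cur.execute(
        """
        INSERT INTO district (source_id, name)
        SELECT source_id, MIN(name)
        FROM district_buffer
        WHERE source_id NOT IN (SELECT source_id FROM district)
        GROUP BY source_id
        ORDER BY source_id
        """
    )
    cur.execute("DROP TABLE district_buffer")
    conn.commit()
    return dict(cur.execute("SELECT source_id, id FROM district"))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_via_buffer(conn, [("d1", "First"), ("d1", "First"), ("d2", "Second")]))
```

Keeping the raw import in a separate table means the deduplication and rekeying can be done in plain SQL before anything touches the table that other tables reference.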
schema/process_schema.py has some attempts at Python classes that represent a database schema, at least as far as I needed them to. Some of the ugliest SQL generation happens in here for rekeying. It could probably get a lot cleaner. At one point I was trying to build a tree of foreign key relations, detect loops, and use that to generate rekeying commands after the data was loaded in. This became unnecessary, thankfully. We really didn't need a lot of explicit relations, since we were dumping the separate tables out to json anyway.
Some of the base configs that are extended for the states are in the data directory. univ_settings.py, table_defaults.py, target_smart_defaults.py, and candidate_defaults.py are the main configs that are inherited.
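One way to picture "base configs extended for the states" is a recursive dictionary merge. The sketch below is only an assumption about the general shape: the contents of UNIV_SETTINGS and the extend_config helper are hypothetical, and the real config files may inherit in a different way (plain Python imports, for instance).

```python
import copy

# Hypothetical base settings; not the real contents of univ_settings.py.
UNIV_SETTINGS = {
    "db": {"host": "localhost", "name": "bip"},
    "encoding": "utf-8",
}


def extend_config(base, overrides):
    """Return a new config with base values recursively replaced by overrides.

    Neither input is modified, so one base can be extended for many states.
    """
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            merged[key] = extend_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical state-level override.
    ohio = extend_config(UNIV_SETTINGS, {"db": {"name": "bip_oh"}})
    print(ohio)
```

A state config then only has to list what differs from the defaults.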

The specific state data sits in data/voterfiles/<two letter state abbreviation> but the voterfiles directory is omitted for a variety of reasons, not least because it is a couple hundred GB of data.


Languages

  • Python 99.7%
  • Shell 0.3%