Skip to content

Tools to scrape and centralize the text of meeting agendas & minutes from local city governments. NOT ACTIVE -- looking for new lead(s)!

Notifications You must be signed in to change notification settings

Data4Democracy/town-council

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

town-council

Tools to scrape and centralize the text of meeting agendas & minutes from local city governments.

Slack: #p-town-council

Project Description: Engagement in local government is limited not only by physical access to city council meetings, but also electronic barriers including difficult-to-navigate web portals and the frequent use of scanned (non-text searchable) .pdf documents. That is, council meetings and their outcomes are not easily tracked by local constituents, journalists, and policy advocates.

Moreover, no tools exist to support the comparison of local government issues between cities.

We aim to provide a publicly available database that automatically scrapes and aggregates the text from city council agendas and minutes, towards the goals of: (1) promoting local government accessibility/transparency and (2) establishing open-source data/software resources to track and analyze trends in local governments.

Project Status / Scope

As of August 2017, this project is on hold -- new co-leads needed. While the existing project members feel that this database is extremely valuable, we unfortunately don't have the time to maintain it at present. We're currently looking for new leads interested in picking up the project.

A rough draft of our infrastructure is shown here. Stack: Python 3, Scrapy 1.4, postgresql.

We have completed scrapers for approximately a dozen cities in the San Francisco Bay Area as initial case study (selected in partnership with activists researching the Bay Area housing crisis; see list of cities), including some general scrapers that work with common content management systems used by cities (e.g., see our Legistar scraper).

We have also successfully automated the document retrieval process (i.e., downloading the agenda and minutes .pdf files; code in pipeline folder).

Work-in-progress included investigating tools to extract the text from the said documents, as well as publicly setting up the database (AWS/Azure). Long-term goals include a front-end interface / search for users with less technical background.

Contributing

Project Co-Leads: @chooliu / @bstarling / TBD

Again, this project is on hold due to limited availability of the current co-leads: please let us know if you'd like to help lead #p-town-council!

To join, just post in the Slack channel (#p-town-council) or contact one of the leads following general D4D onboarding.

For volunteers interested in writing scrapers/helping out with initial development, as a first step install , then try to run one of our scrapers using our "council crawler" readme.md.

Skills:

Volunteers with backgrounds in and/or interest in learning the following are highly desired:

  • web scraping
  • .pdf scraping / OCR
  • database management
  • natural language processing / text wranglers

At present, our focus is to develop the database infrastructure (folks with web scraping experience highly desired!), but also welcome researchers/analysts interested in local politics and downstream analyses with the data.

Future analyses enabled with this database may include:

  • Counting mentions of large organizations (lobbyist, think-tanks, corporations) in local meetings.
  • Mapping concern for state/national issues (e.g., Affordable Health Care for America Act) at high resolution by pairing with local demographic metadata (e.g., political affiliation, median income).

Related Work

While there are attempts to do this at the state and federal level (shoutout to organizations like 4US, Digital Democracy, GovTrack and the Open States Project), no similar resource yet exists for local governments.

However, we encourage those interested in learning more about local government / data science tools for civic tech to explore the great foundational work of Open Civic Data and Councilmatic.

About

Tools to scrape and centralize the text of meeting agendas & minutes from local city governments. NOT ACTIVE -- looking for new lead(s)!

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages