Tools to Explore Price Transparency Data

This repository is part of a personal project. I'm exploring the price transparency data posted by large insurance companies.

Insurance companies must post transparency data on their websites, but they don't make it easy to access and retrieve the data in an aggregate form.

collectRawDataURLs.py is a script that retrieves the URLs where the data lives. It collects thousands of URLs, each pointing to a .json.gz file.
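As a rough illustration of the URL-collection idea (not the actual logic of collectRawDataURLs.py), the sketch below walks an already-parsed JSON index and gathers every string that looks like a link to a .json.gz file. The function name `collect_gz_urls`, the sample index, and its `files`/`location` keys are all hypothetical; a real insurer index may be shaped differently.

```python
import json


def collect_gz_urls(node, found=None):
    """Recursively gather every string ending in .json.gz from parsed JSON."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for value in node.values():
            collect_gz_urls(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_gz_urls(item, found)
    elif isinstance(node, str):
        # Ignore any query string (e.g. a signed-URL token) when checking the suffix.
        if node.split("?", 1)[0].endswith(".json.gz"):
            found.append(node)
    return found


# Hypothetical index document; the keys here are assumptions for illustration only.
sample_index = json.loads("""
{
  "files": [
    {"description": "plan A", "location": "https://example.com/a.json.gz"},
    {"description": "plan B", "location": "https://example.com/b.json.gz?sig=abc"}
  ],
  "note": "not a url"
}
""")

urls = collect_gz_urls(sample_index)
for url in urls:
    print(url)
```

Walking the whole structure rather than reading fixed keys keeps the sketch independent of any one insurer's index layout, at the cost of possibly picking up links that are not data files.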

parseRecords.py is a tool that decompresses and parses the large files themselves.
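A minimal sketch of the decompress-and-parse step, using only the Python standard library, might look like the following. It is not taken from parseRecords.py; the function name `parse_gzipped_json` and the tiny in-memory payload are assumptions made for illustration.

```python
import gzip
import io
import json


def parse_gzipped_json(source):
    """Decompress a .json.gz source (a path or a binary file object) and parse it."""
    # Mode "rt" makes gzip decode the bytes to text as they are decompressed.
    with gzip.open(source, mode="rt", encoding="utf-8") as handle:
        return json.load(handle)


# Build a small gzipped JSON payload in memory to stand in for a downloaded file.
payload = {"name": "example", "records": [1, 2, 3]}
buffer = io.BytesIO(gzip.compress(json.dumps(payload).encode("utf-8")))

parsed = parse_gzipped_json(buffer)
print(parsed["records"])
```

Note that `json.load` builds the whole document in memory. For files that run to many gigabytes, an incremental parser (for example the third-party `ijson` package) is likely a better fit.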

Each file is built around a schema recommended, but not required, by CMS. Here's a diagram of part of it, generated by JSON Crack.

Figure 1 - Part of the extremely complex schema defined by CMS for JSON files containing price transparency data.
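To give a feel for how deeply the schema nests, here is a hedged sketch that flattens one branch of an in-network file into flat rows. The field names (`in_network`, `negotiated_rates`, `negotiated_prices`, and so on) follow my reading of the CMS in-network schema and should be checked against the published version; since the schema is recommended rather than required, real files may deviate. The function name `flatten_in_network` and the sample values are hypothetical.

```python
def flatten_in_network(document):
    """Turn nested in-network items into one flat row per negotiated price."""
    rows = []
    for item in document.get("in_network", []):
        for rate in item.get("negotiated_rates", []):
            for price in rate.get("negotiated_prices", []):
                rows.append({
                    "billing_code_type": item.get("billing_code_type"),
                    "billing_code": item.get("billing_code"),
                    "negotiated_type": price.get("negotiated_type"),
                    "negotiated_rate": price.get("negotiated_rate"),
                    "billing_class": price.get("billing_class"),
                })
    return rows


# Made-up sample data; the values are illustrative, not real prices.
sample_document = {
    "in_network": [
        {
            "billing_code_type": "CPT",
            "billing_code": "99213",
            "negotiated_rates": [
                {
                    "negotiated_prices": [
                        {"negotiated_type": "negotiated",
                         "negotiated_rate": 123.45,
                         "billing_class": "professional"},
                        {"negotiated_type": "negotiated",
                         "negotiated_rate": 200.0,
                         "billing_class": "institutional"},
                    ]
                }
            ],
        }
    ]
}

rows = flatten_in_network(sample_document)
for row in rows:
    print(row)
```

Using `.get` with defaults throughout means a file that omits a level simply yields fewer rows instead of raising an error.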