awesome-data-quality

A curated list of awesome tools for testing and monitoring data quality - typically at the data warehouse/lake or within running data pipelines.

If you want to contribute to this list (please do), send me a pull request or contact me.

Table of Contents

TBD

Frameworks and Libraries

Open source

Geared for ML
  • deepchecks - tool for validating your machine learning models and data. Implements test suites tailored to ML models, datasets, and outputs.
  • evidently - analyze and track data and ML model output quality.
Pipelines with data quality included
  • dbt, dataform - ELT tools that come with a handy utility for defining tests as SQL queries.
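
To illustrate the "tests as SQL queries" idea, here is a minimal sketch in plain Python with the standard-library `sqlite3` module. It does not use dbt or dataform. It assumes the common convention that a test query selects the rows violating an expectation, so the test passes when no rows come back. The `orders` table and the `run_sql_test` helper are hypothetical names made up for this example.

```python
import sqlite3


def run_sql_test(conn, query):
    """Run a test query and return the offending rows (empty list = test passed).

    Hypothetical helper for illustration only; not part of dbt or dataform.
    """
    return conn.execute(query).fetchall()


# Build a tiny in-memory table with one negative and one missing amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, -5.0), (3, None)],
)

# Each test is just a SELECT that returns the rows breaking the rule.
negative_amounts = run_sql_test(conn, "SELECT id FROM orders WHERE amount < 0")
null_amounts = run_sql_test(conn, "SELECT id FROM orders WHERE amount IS NULL")

for name, failures in [("no_negative_amounts", negative_amounts),
                       ("no_null_amounts", null_amounts)]:
    status = "PASS" if not failures else f"FAIL ({len(failures)} rows)"
    print(f"{name}: {status}")
```

Writing the check as a query that returns failures keeps the test logic in SQL, next to the transformations it guards, and makes the failing rows available for inspection.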

Paid

Offerings range from data testing to pipeline testing, with a focus on real-time monitoring, automated test creation and threshold setting, and additional enterprise features.

TODOs

  • Add tools for unstructured data (Arthur, Robust)