Many MLOps issues stem from data-related problems. Unfortunately, the hype surrounding algorithms overshadows these problems, even though they cause data leakage and pipeline jungles that lead to model failure. I propose a streamlined, adaptive, data-centric ML pipeline for domains such as hardware verification, where schemas are absent, data types are inaccurate, and data drift is extreme (changes in both shape and type). In this pipeline, schemas are inferred from raw data and used for monitoring and preprocessing. During serving, schema mismatches are resolved, which increases robustness. The pipeline also makes “data tuning” (preprocessing optimization) easy, which improved model performance in real-world benchmark testing.
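To make the idea of schema inference and serving-time mismatch resolution concrete, the following is a minimal sketch and not the pipeline's actual implementation. It assumes raw records arrive as dictionaries of loosely typed values. The function names `infer_schema` and `resolve`, the column names, and the policy of mapping uncastable or missing values to `None` are all hypothetical choices made for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]
Schema = Dict[str, type]


def _parses_as(values: List[Any], cast: type) -> bool:
    """Return True if the string form of every value converts with `cast`."""
    try:
        for v in values:
            cast(str(v))
    except ValueError:
        return False
    return True


def infer_schema(rows: List[Row]) -> Schema:
    """Infer the narrowest of int, float, str for each column (hypothetical policy)."""
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for name, value in row.items():
            seen = columns.setdefault(name, [])
            if value not in (None, ""):
                seen.append(value)
    schema: Schema = {}
    for name, values in columns.items():
        if values and _parses_as(values, int):
            schema[name] = int
        elif values and _parses_as(values, float):
            schema[name] = float
        else:
            schema[name] = str  # fallback, also used for all-empty columns
    return schema


def resolve(row: Row, schema: Schema) -> Row:
    """Align a serving-time row with `schema`.

    Unknown columns are dropped, missing columns are filled with None,
    and values that cannot be cast to the expected type become None.
    """
    resolved: Row = {}
    for name, expected in schema.items():
        value = row.get(name)
        try:
            resolved[name] = None if value in (None, "") else expected(str(value))
        except ValueError:
            resolved[name] = None
    return resolved


# Illustrative data only; these column names are assumptions.
training_rows = [
    {"cycles": "120", "voltage": "1.2", "status": "pass"},
    {"cycles": "98", "voltage": "1", "status": "fail"},
]
schema = infer_schema(training_rows)

# The serving row has a type change, a missing column, and a new column.
serving_row = {"cycles": "n/a", "voltage": "1.1", "extra_signal": "7"}
clean = resolve(serving_row, schema)
```

In this sketch the inferred schema doubles as the monitoring reference: any column that `resolve` has to drop, fill, or null out is a signal of drift that could be logged before the row reaches the model.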