
data-deduplication

Here are 16 public repositories matching this topic...

This repository contains SQL scripts and documentation for cleaning and standardizing data in the NashvilleHousing table within the sqlproject2 database. The project aims to prepare the dataset for analysis by addressing inconsistencies, filling missing values, standardizing formats, and removing duplicates.

  • Updated Jun 17, 2024
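The duplicate-removal step this description mentions can be illustrated with a minimal sketch. It is not taken from the repository, which uses SQL scripts; the function name `drop_duplicates` and the column names `parcel_id` and `sale_date` are hypothetical, and the choice to trim and lowercase values before comparing is an assumption about what counts as a duplicate.

```python
def normalize(value):
    """Trim whitespace and lowercase so formatting differences do not hide duplicates."""
    return str(value).strip().lower()


def drop_duplicates(rows, key_fields):
    """Keep the first row seen for each normalized key; later matches are dropped."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(normalize(row[field]) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample rows; the first two differ only in case and trailing space.
rows = [
    {"parcel_id": "A1", "sale_date": "2016-01-01"},
    {"parcel_id": "a1 ", "sale_date": "2016-01-01"},
    {"parcel_id": "B2", "sale_date": "2016-02-01"},
]
cleaned = drop_duplicates(rows, ["parcel_id", "sale_date"])
```

Keeping the first occurrence is one possible policy; a real cleanup might instead keep the most complete or most recent row.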

The HR Roster Change Detection Pipeline is an automated solution for processing HR roster data. Leveraging Apache Airflow and PostgreSQL, it enables seamless data ingestion, deduplication, and change detection, streamlining HR operations.

  • Updated Dec 4, 2024
  • Python
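As a rough illustration of change detection between two roster snapshots, the sketch below fingerprints each row and compares the fingerprints by key. It is an assumption about how such a step could work, not the pipeline's actual code: it uses plain Python rather than Apache Airflow or PostgreSQL, and the names `row_fingerprint`, `detect_changes`, and `employee_id` are hypothetical.

```python
import hashlib
import json


def row_fingerprint(row):
    """Hash a row's contents so two versions can be compared cheaply."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(previous, current, key):
    """Compare two snapshots and report added, removed, and changed keys."""
    # Building a dict per snapshot also deduplicates by key (the last row wins).
    prev_by_key = {row[key]: row_fingerprint(row) for row in previous}
    curr_by_key = {row[key]: row_fingerprint(row) for row in current}
    added = sorted(k for k in curr_by_key if k not in prev_by_key)
    removed = sorted(k for k in prev_by_key if k not in curr_by_key)
    changed = sorted(
        k for k in curr_by_key
        if k in prev_by_key and curr_by_key[k] != prev_by_key[k]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical roster snapshots.
yesterday = [
    {"employee_id": 1, "title": "Analyst"},
    {"employee_id": 2, "title": "Engineer"},
]
today = [
    {"employee_id": 2, "title": "Senior Engineer"},
    {"employee_id": 3, "title": "Designer"},
]
report = detect_changes(yesterday, today, "employee_id")
```

Hashing the whole row treats any field difference as a change; a production pipeline would likely restrict the fingerprint to the fields it cares about.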
