This repository contains the code for an experimental analysis of data deletion in learned database systems. We study four learned database systems:
DBEst++: Approximate Query Processing using Mixture Density Networks
Naru: Cardinality Estimation using Deep Autoregressive Networks
TVAE: Data Generation using Tabular Variational AutoEncoders
Classification: Tabular data classification using deep neural networks such as ResNet.
To install the required packages, run the environment setup script for each application:
bash ./environments/dbest/setup.sh
bash ./environments/naru/setup.sh
bash ./environments/tvae/setup.sh
bash ./environments/tcls/setup.sh
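If you want to set up all four environments in one go, the commands above can be wrapped in a loop. The following is a minimal sketch, not part of the repository: it assumes you run it from the repository root, and the `ran` and `missing` counters are illustrative names introduced here. It stops at the first setup script that fails and reports any script it cannot find.

```shell
#!/usr/bin/env bash
# Sketch: run the four setup scripts listed above, one after another.
# Assumption: the current directory is the repository root.
ran=0       # number of setup scripts that were found and completed
missing=0   # number of setup scripts that were not found

for app in dbest naru tvae tcls; do
  script="./environments/${app}/setup.sh"
  if [ -f "$script" ]; then
    # Abort on the first failing setup so later steps do not hide the error.
    bash "$script" || { echo "setup failed for ${app}" >&2; exit 1; }
    ran=$((ran + 1))
  else
    echo "missing: ${script}" >&2
    missing=$((missing + 1))
  fi
done

echo "setup scripts run: ${ran}, missing: ${missing}"
```

Running the scripts individually, as listed above, remains the safer choice if you only need one of the applications.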
We use three real-world datasets for our evaluations: Census, Forest, and DMV. You can download the versions we use from here
To improve reproducibility, we have created experimental pipelines for each application. For Census and Forest, see the exp_census.py and exp_forest.py scripts. For DMV, bash scripts are provided to run the training and evaluation commands.
We have used code from the repositories below, which are the official implementations of the applications we have studied.