Avoiding joins safely
Code for the Monte Carlo simulations that study how different properties of normalized data affect ML accuracy. Implementations in R for avoiding joins safely with Naive Bayes and logistic regression with four feature selection methods. For technical details, refer to the paper "To Join or Not to Join? Thinking Twice about Joins before Feature Selection" (SIGMOD 2016).
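
As a rough illustration of what "avoiding a join" means here, the following R sketch simulates a tiny normalized dataset and compares a Naive Bayes classifier trained on the joined features with one that uses only the foreign key. It is not taken from this repository: the table names (`R`, `S`), column names (`RID`, `FK`, `XR`, `Y`), sizes, noise level, and the helper `nb_predict` are all assumptions made for the example, and the classifier is a minimal hand-written categorical Naive Bayes with add-one smoothing rather than the implementation used in the paper.

```r
# Illustrative sketch only: names, sizes, and noise level are assumptions.
set.seed(1)
n_R <- 20     # rows in the attribute table R
n_S <- 2000   # rows in the entity table S

# Attribute table R: primary key RID and one binary feature XR.
R <- data.frame(RID = 1:n_R, XR = rep(c(0, 1), length.out = n_R))

# Entity table S: foreign key FK into R; the target Y equals the
# referenced XR value, flipped with probability 0.1.
FK <- sample(n_R, n_S, replace = TRUE)
XR <- R$XR[match(FK, R$RID)]            # the key-foreign key join
Y  <- ifelse(runif(n_S) < 0.1, 1 - XR, XR)

D <- data.frame(
  FK = factor(FK, levels = R$RID),
  XR = factor(XR, levels = c(0, 1)),
  Y  = factor(Y,  levels = c(0, 1))
)
train <- D[1:1000, ]
test  <- D[1001:n_S, ]

# Minimal categorical Naive Bayes with add-one (Laplace) smoothing.
# All features and the target must be factors with shared levels.
nb_predict <- function(train, test, features, target = "Y") {
  classes  <- levels(train[[target]])
  log_post <- matrix(0, nrow = nrow(test), ncol = length(classes))
  for (k in seq_along(classes)) {
    rows <- train[[target]] == classes[k]
    log_post[, k] <- log(mean(rows))                   # log prior
    for (f in features) {
      counts <- as.vector(table(train[[f]][rows]))     # counts per level
      probs  <- (counts + 1) / (sum(counts) + length(counts))
      log_post[, k] <- log_post[, k] + log(probs[as.integer(test[[f]])])
    }
  }
  factor(classes[max.col(log_post, ties.method = "first")], levels = classes)
}

# "Join": use the foreign key and the joined-in feature XR.
acc_join   <- mean(nb_predict(train, test, c("FK", "XR")) == test$Y)
# "No join": use the foreign key alone as a stand-in for XR.
acc_nojoin <- mean(nb_predict(train, test, "FK") == test$Y)

print(c(join = acc_join, no_join = acc_nojoin))
```

In this toy setting each foreign key value appears many times in the training data, so the foreign key alone carries the information in `XR` and the two accuracies should come out close. With far fewer training rows per key value the no-join model would be expected to degrade, which is the kind of effect the simulations are meant to study.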