This repository contains the activities completed for the Advanced Data Mining course of the Master's Degree in Data Science at UOC (Open University of Catalonia). It includes several Jupyter notebooks covering topics in data mining, artificial intelligence, and data processing.
Throughout this practice we will see how to apply different techniques for data loading and preparation:
- Loading a data set
- Data analysis
- Basic statistical analysis
- Exploratory data analysis
- Dimensionality reduction (sklearn PCA)
- Training and testing
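
The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: the Iris data set bundled with scikit-learn is an assumed stand-in for the course data, and the 75/25 split, the scaling step, and the two principal components are arbitrary choices.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: Iris is only a stand-in data set that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Basic statistical analysis: per-feature mean and standard deviation.
print("mean:", X.mean(axis=0))
print("std: ", X.std(axis=0))

# Hold out a test set before fitting any transformation,
# so that no information from the test samples leaks into training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Fit the scaler and PCA on the training split only, then apply to both.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=2).fit(scaler.transform(X_train))

X_train_2d = pca.transform(scaler.transform(X_train))
X_test_2d = pca.transform(scaler.transform(X_test))

print("explained variance ratio:", pca.explained_variance_ratio_)
```

Fitting the scaler and the PCA on the training split alone keeps the test split untouched until evaluation.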
Throughout this activity we will see how to apply different unsupervised techniques as well as some of their real applications:
- Clustering with different strategies: k-means and elbow rule, density-based and hierarchical.
- Optimization with dimensionality reduction: t-SNE.
- Application: identification of tourist points of interest.
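
As a rough sketch of the elbow rule mentioned above, the snippet below fits k-means for several values of k and records the inertia (within-cluster sum of squared distances). The synthetic blobs, the range of k, and the `inertias` dictionary are assumptions made for illustration and are not taken from the notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Assumption: synthetic 2-D data with four well-separated groups.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

# Fit k-means for k = 1..8 and store the inertia of each model.
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

# The "elbow" is the k after which inertia stops dropping sharply.
for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting `inertias` against k usually makes the bend easier to spot than reading the printed values.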
Throughout this activity we will see how to apply different supervised algorithms as well as some of their real applications:
- kNN (k-Nearest Neighbors): User rating of a given app.
- SVM (Support Vector Machine): Face recognition.
- Decision Trees: Prediction of child seat sales.
- Naive Bayes: Buying or renting a house? Which is more convenient for me?
- Finally, we will study the decision boundaries of the different algorithms.
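
One possible way to compare the decision boundaries of these four algorithms is to train each on the same 2-D data and predict the class of every point on a dense grid. The sketch below assumes a synthetic "two moons" data set and arbitrary hyperparameters; the `models` and `regions` names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: a synthetic, non-linearly separable two-class data set.
X, y = make_moons(n_samples=200, noise=0.25, random_state=0)

# Hyperparameters are arbitrary choices for the illustration.
models = {
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf", gamma="scale"),
    "Decision tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Naive Bayes": GaussianNB(),
}

# Dense grid covering the data, used to probe each classifier.
xx, yy = np.meshgrid(
    np.linspace(X[:, 0].min() - 0.5, X[:, 0].max() + 0.5, 100),
    np.linspace(X[:, 1].min() - 0.5, X[:, 1].max() + 0.5, 100),
)
grid = np.c_[xx.ravel(), yy.ravel()]

# Predicted class for every grid point; class changes mark the boundary.
regions = {}
for name, model in models.items():
    model.fit(X, y)
    regions[name] = model.predict(grid).reshape(xx.shape)
    print(f"{name}: training accuracy = {model.score(X, y):.2f}")
```

Each array in `regions` can then be drawn with matplotlib's `contourf(xx, yy, regions[name])` to visualize the boundary.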
This practice is divided into two parts:
- In the first exercise we will see how to decompose and compose time series to make forward predictions.
- In the second exercise we will study different methods of combining classifiers.
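
To illustrate the first exercise, here is a minimal sketch of a classical additive decomposition (series = trend + seasonal + residual) followed by recomposition to forecast future values. The synthetic series, the period of 4, and all function names are assumptions; the notebook may use a different method or library.

```python
# Assumption: synthetic series with a linear trend and a period-4 pattern.
period = 4
pattern = [3.0, -1.0, -4.0, 2.0]
series = [10.0 + 0.5 * t + pattern[t % period] for t in range(24)]


def centered_moving_average(values, period):
    """Trend estimate; half weights at the window ends for an even period."""
    half = period // 2
    trend = {}
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def fit_line(ts, vs):
    """Ordinary least-squares line through the points (ts, vs)."""
    t_mean = sum(ts) / len(ts)
    v_mean = sum(vs) / len(vs)
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, vs)) / sum(
        (t - t_mean) ** 2 for t in ts
    )
    return slope, v_mean - slope * t_mean


# Decompose: trend, then the average detrended value for each phase.
trend = centered_moving_average(series, period)
seasonal = []
for phase in range(period):
    detrended = [series[t] - trend[t] for t in trend if t % period == phase]
    seasonal.append(sum(detrended) / len(detrended))
offset = sum(seasonal) / period
seasonal = [s - offset for s in seasonal]  # center the seasonal component
residual = {t: series[t] - trend[t] - seasonal[t % period] for t in trend}

# Compose: extrapolate the trend linearly and add the seasonal component.
slope, intercept = fit_line(list(trend), list(trend.values()))


def forecast(steps):
    n = len(series)
    return [intercept + slope * t + seasonal[t % period] for t in range(n, n + steps)]


print(forecast(4))
```

Extrapolating the trend with a straight line is the simplest possible choice; real series often call for a more flexible trend model.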