Project assignment from my internship with One Campus Academy: Heart Disease Analysis
This project requires you to work with a dataset from the medical domain: the Heart Disease dataset, published in the UCI Machine Learning Repository. The dataset originally contained 76 attributes, but published experiments have used only 14 of them, so we will use the same subset for our data analysis.
The dataset uses a lot of medical terminology that may be unfamiliar to you. You are required to check for outliers and missing values, and to examine the trends and relationships between different features of the dataset (using univariate and bivariate techniques), in order to gain a better understanding of the available data and derive useful insights from it.
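As a rough illustration of the missing-value and outlier checks described above, here is a minimal sketch using only the Python standard library. The rows are invented for illustration and are not taken from the real dataset. The column names (age, chol, ca) follow the UCI attribute names. The "?" marker for missing values and the 1.5 x IQR rule for outliers are assumptions chosen for this example, not requirements of the assignment. In practice a library such as pandas would probably be more convenient.

```python
import statistics

# Toy rows, invented for illustration only (not real patient data).
# "?" is assumed to be the marker for a missing value.
rows = [
    {"age": "63", "chol": "233", "ca": "0"},
    {"age": "67", "chol": "286", "ca": "3"},
    {"age": "41", "chol": "204", "ca": "?"},
    {"age": "56", "chol": "236", "ca": "0"},
    {"age": "57", "chol": "564", "ca": "1"},
    {"age": "44", "chol": "?", "ca": "0"},
    {"age": "52", "chol": "199", "ca": "2"},
    {"age": "54", "chol": "239", "ca": "0"},
]


def count_missing(rows, marker="?"):
    """Count, per column, how many rows hold the missing-value marker."""
    counts = {col: 0 for col in rows[0]}
    for row in rows:
        for col, value in row.items():
            if value == marker:
                counts[col] += 1
    return counts


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


missing = count_missing(rows)

# Drop missing entries before converting the column to numbers.
chol = [float(r["chol"]) for r in rows if r["chol"] != "?"]
chol_outliers = iqr_outliers(chol)

print("Missing values per column:", missing)
print("Cholesterol outliers:", chol_outliers)
```

Whether a flagged value is a data-entry error or a genuine extreme reading is a judgment call that the IQR rule cannot make by itself, so each flagged value is worth inspecting before it is removed.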
The dataset is typically used to train a model that predicts whether a person is likely to suffer from heart disease (that is, whether the predicted probability is above or below 50%). Your task, however, is to observe and analyze the distribution of the data, search for outliers and missing values, and assess the relationships between features.
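The univariate and bivariate parts of the task could be sketched along the following lines, again with the standard library only. The records are made up for illustration. The names age and thalach (maximum heart rate achieved) follow the UCI attribute names, and coding the target as 1 for disease and 0 for no disease is an assumption. The helper functions summarize, pearson, and mean_by_group are hypothetical names introduced for this example.

```python
import math
import statistics
from collections import defaultdict

# Toy records (age, thalach, target), invented for illustration only.
# target: 1 = heart disease, 0 = no heart disease (assumed coding).
records = [
    (40, 180, 0),
    (45, 175, 0),
    (50, 165, 0),
    (55, 150, 1),
    (60, 140, 1),
    (65, 130, 1),
]

ages = [r[0] for r in records]
thalach = [r[1] for r in records]
targets = [r[2] for r in records]


def summarize(values):
    """Univariate summary: basic location statistics for one feature."""
    return {
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Bivariate: Pearson correlation coefficient between two features."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_by_group(values, labels):
    """Bivariate: mean of a numeric feature within each label group."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    return {label: statistics.mean(vals) for label, vals in groups.items()}


age_summary = summarize(ages)
r_age_thalach = pearson(ages, thalach)
thalach_by_target = mean_by_group(thalach, targets)

print("Age summary:", age_summary)
print("Correlation between age and thalach:", round(r_age_thalach, 3))
print("Mean thalach by target:", thalach_by_target)
```

On these toy records the correlation is strongly negative by construction. On the real data, the strength and direction of each relationship is exactly what the analysis is meant to find out.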