In this project, we investigate a dataset of 1,599 red wine samples described by a variety of attributes such as fixed acidity, sulfur dioxide, pH and alcohol. Each sample is assigned a quality rating from 0 to 10.
- Create plots and calculate descriptive statistics for each individual attribute to learn about its distribution and other noteworthy characteristics.
- Make some observations regarding the structure and main features of interest of our dataset, as well as other attributes that are most likely to influence the quality rating of red wine samples.
- Examine the relationship between every pair of attributes through a correlation matrix and a scatterplot matrix. Pairs of attributes with at least moderate correlation are explored further using plots and other mathematical tools such as the Pearson product-moment correlation coefficient.
- Analyze and draw conclusions about the correlations between these attributes, including the particularly strong or interesting relationships and hidden insights that we have found.
- Dive deeper into the red wine dataset by examining the relationships between multiple features at the same time, using density and scatter plots.
- Report the significant relationships that we have identified in this section, along with a number of surprising interactions between features that went unnoticed in earlier parts.
- The three most important findings of our project, together with their accompanying plots and details, are gathered and documented in this last section.
- Some final reflections on our effort, most notably the challenges that we faced and overcame in order to deliver a meaningful and comprehensive report on time for our readers.
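
The pairwise step in the outline above (compute a correlation for every pair of attributes, then keep the pairs with at least moderate correlation) can be sketched roughly as follows. This is a minimal illustration, not the code used in the report: the helper names `pearson_r` and `moderate_pairs`, the cutoff of 0.3 for "moderate", and the toy columns are all assumptions made for the example, and the toy values are not taken from the wine dataset.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def moderate_pairs(columns, threshold=0.3):
    """Return (name_a, name_b, r) for each pair of columns with |r| >= threshold.

    The default threshold of 0.3 is an assumed cutoff for "moderate".
    """
    pairs = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson_r(col_a, col_b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Toy columns for illustration only; NOT values from the wine dataset.
toy = {
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],    # a multiple of x, so strongly correlated
    "z": [1.0, -1.0, -1.0, 1.0],  # constructed to be uncorrelated with x
}

selected = moderate_pairs(toy)
for name_a, name_b, r in selected:
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

With real data, each key of the dictionary would be an attribute name and each list the corresponding column of measurements; the pairs returned are the candidates worth plotting in more detail.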