The Big Data Club Diamond Challenge

This repository contains a number of challenges for Big Data Club members to test their data analysis and machine learning skills. This challenge uses data located at https://www.kaggle.com/shivam2503/diamonds. For more information, see the notebook in the root folder.

Data Attributes

price - price in US dollars ($326--$18,823)

carat - weight of the diamond (0.2--5.01)

cut - quality of the cut (Fair, Good, Very Good, Premium, Ideal)

color - diamond color grade, from J (worst) to D (best)

clarity - a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))

x - length in mm (0--10.74)

y - width in mm (0--58.9)

z - depth in mm (0--31.8)

depth - total depth percentage = 100 * z / mean(x, y) = 200 * z / (x + y) (43--79)

table - width of top of diamond relative to widest point (43--95)
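
As a rough illustration of how the attributes above might be used, the sketch below encodes the three categorical attributes as ordinal ranks and recomputes depth from x, y and z. It makes several assumptions: depth is treated as a percentage (100 * z / mean(x, y)), the cut levels are taken to run from worst to best in the order listed, the intermediate color grades between J and D are filled in alphabetically, and the helper names `ordinal_rank` and `depth_percentage` are hypothetical and not part of the challenge notebook. Because x, y and z have a minimum of 0 in the data, the depth helper returns None rather than dividing by zero.

```python
# Category levels ordered worst to best (ordering of cut and the
# intermediate color grades are assumptions, see the note above).
CUT_ORDER = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
COLOR_ORDER = ["J", "I", "H", "G", "F", "E", "D"]
CLARITY_ORDER = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]


def ordinal_rank(value, order):
    """Return the 0-based position of value in order (0 = worst level)."""
    return order.index(value)


def depth_percentage(x, y, z):
    """Recompute total depth percentage as 100 * z / mean(x, y).

    Returns None when x + y is 0, since some rows record 0 mm dimensions.
    """
    if x + y == 0:
        return None
    return 200.0 * z / (x + y)


# Hypothetical measurements in mm, not a row from the dataset.
example_depth = depth_percentage(4.0, 4.0, 2.5)
example_cut_rank = ordinal_rank("Ideal", CUT_ORDER)
```

Comparing the recomputed value against the recorded depth column is one possible way to flag rows with inconsistent or zero measurements before modelling.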