I carried out an analysis, with commentary, and an observational study on the Female Pima Indians Diabetes dataset. Details about the dataset itself are available here. Specific details about the variables, and about how the records were sampled from volunteers, are included in the RPubs document (Tab 1.1).
- Part 1 takes a statistical approach to the dataset in R: exploring the data, then proposing and comparing statistical models to find the best one for predicting whether a Pima Indian female has diabetes.
- Part 2 uses a k-NN classifier and, once again, a logistic regression classifier for the same prediction, in Python.
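
To give a feel for the k-NN idea used in Part 2, here is a minimal standard-library sketch. It is not the project's code: the data is synthetic (two made-up features standing in for something like standardized glucose and BMI), and the names `knn_predict` and `make_points` are hypothetical helpers introduced only for this illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def make_points(center, label, n):
    """Draw n synthetic 2-D points around `center` (assumption: unit-variance Gaussian)."""
    return [
        ([random.gauss(center[0], 1.0), random.gauss(center[1], 1.0)], label)
        for _ in range(n)
    ]


# Synthetic stand-in data, NOT the Pima dataset: label 1 = diabetic, 0 = not.
random.seed(0)
data = make_points((-2, -2), 0, 100) + make_points((2, 2), 1, 100)
random.shuffle(data)

# Simple hold-out split: 150 points to "train" on, 50 to evaluate.
train, test = data[:150], data[150:]
train_X = [x for x, _ in train]
train_y = [y for _, y in train]

correct = sum(knn_predict(train_X, train_y, x, k=5) == y for x, y in test)
accuracy = correct / len(test)
print(f"k-NN accuracy on held-out synthetic points: {accuracy:.2f}")
```

On real data the features would normally be standardized first, since k-NN depends on distances, and `k` would be chosen by cross-validation rather than fixed at 5.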
The RPubs document is a published R Markdown document, generated with the "knitr" package in RStudio, the go-to IDE for R.
Find the complete project hosted here: link