Classification of Cancer Dataset by Logistic Regression

Overview

This is an example machine learning implementation that uses logistic regression for classification. A dataset of 569 cancer instances is used to predict whether each instance is malignant (labeled 0) or benign (labeled 1), based on 30 features per instance. Because the task is to distinguish between two outcomes, malignant and benign, it is a two-class (binary) classification problem.
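The dataset shape and label encoding described above can be checked with a short sketch. It assumes that the breast cancer dataset bundled with scikit-learn (`load_breast_cancer`) matches the data used in the notebook; it has the same 569 instances, 30 features, and 0/1 labels, but the notebook may load its data differently.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Assumption: scikit-learn's bundled dataset stands in for the notebook's data.
data = load_breast_cancer()

n_samples, n_features = data.data.shape
class_counts = np.bincount(data.target)  # index 0 = malignant, index 1 = benign

print("instances:", n_samples)
print("features:", n_features)
print("label names (0, 1):", list(data.target_names))
print("first two features:", list(data.feature_names[:2]))
print("instances per label:", class_counts.tolist())
```

The order of `target_names` shows which class each integer label stands for.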

Installation

Prerequisites

Jupyter Notebook
NumPy
Pandas
scikit-learn

Setup

Use the <> Code button above to download the .ipynb file (a notebook stored in JSON format). Then open it in Jupyter Notebook and run the cells to reproduce the same outputs.

Method

First, the dataset is split into training and test data. Next, a classifier is created and trained on the training data; here, we use a logistic regression classifier. The trained model then makes predictions on the test data, and we check how well those predictions match the true test labels.
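The steps above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: it assumes scikit-learn's bundled `load_breast_cancer` dataset, a 30% test split, an arbitrary `random_state`, and a raised `max_iter` so the solver converges on unscaled features. The number of misclassified samples will depend on these choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumption: the bundled dataset stands in for the data used in the notebook.
X, y = load_breast_cancer(return_X_y=True)

# Step 1: split into training and test data.
# Assumption: 30% is held out for testing; random_state=0 is arbitrary.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Step 2: create a logistic regression classifier and train it.
# Assumption: max_iter is raised to help convergence on unscaled features.
clf = LogisticRegression(max_iter=10000)
clf.fit(X_train, y_train)

# Step 3: predict labels for the test data.
y_pred = clf.predict(X_test)

# Step 4: compare the predictions with the true test labels.
n_wrong = int((y_pred != y_test).sum())
accuracy = clf.score(X_test, y_test)
print(f"{n_wrong} of {len(y_test)} test predictions wrong, accuracy {accuracy:.3f}")
```

Holding the test data out of training is what makes the final comparison a fair estimate of how the model behaves on unseen cases.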

The model learns from feature values such as tumor radius and surface texture, which allows it to predict, for previously unseen data, whether a case is malignant.

Result

In this example, 8 of the 171 test predictions were wrong, which corresponds to an accuracy of about 95.3%. Such a high accuracy suggests that the features are informative and that this classification problem is relatively easy.
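The accuracy implied by these counts can be worked out directly. The sketch below also checks that 171 test samples is consistent with holding out 30% of the 569 instances; the 30% ratio is an inference from the numbers, not something stated in this README.

```python
import math

n_total = 569   # instances in the dataset
n_test = 171    # test samples reported in the result
n_wrong = 8     # misclassified test samples reported in the result

# Accuracy is the fraction of test samples predicted correctly.
accuracy = (n_test - n_wrong) / n_test
print(f"accuracy = {accuracy:.3f}")

# Assumption: a 30% test split, rounded up, would give this test-set size.
implied_test_size = math.ceil(0.3 * n_total)
print("test size for a 30% split:", implied_test_size)
```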

References

Tamaki Toru. 2020. Machine-Learning by Python; An introduction to classification with scikit-learn. Udemy.
