Dengue Cases Analysis and Clustering

Introduction

This project analyzes and visualizes dengue case data in the Philippines from 2016 to 2020. By applying K-Means clustering techniques, the project aims to identify patterns and trends in dengue cases and deaths across different regions. Understanding these patterns is crucial for preventing the spread of dengue and addressing the healthcare needs of the community effectively.

Project Structure

Dengue-analysis-clustering.ipynb: The main Jupyter notebook containing all steps of the analysis: data loading, preprocessing, exploratory data analysis (EDA), clustering, and evaluation.
ph_dengue_cases2016-2020.csv: The dataset used for the analysis (include this file only if sharing the dataset is permitted).

Steps and Analysis

Data Loading and Initial Inspection:
- Load the dataset and inspect its structure and contents.
Data Formatting and Cleaning:
- Check for null values and data types of each column.
- Convert the 'Month' column to a categorical datatype and encode the 'Region' column.
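
The loading and cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's exact code: the column names `Year`, `Dengue_Cases`, and `Dengue_Deaths`, the full month names, and the new `Region_Code` column are assumptions, and the small sample frame is made up so the snippet runs without the CSV. With the real file, `pd.read_csv("ph_dengue_cases2016-2020.csv")` would take the place of the sample.

```python
import pandas as pd

# Calendar order for the 'Month' column (assumes full English month names).
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy with an ordered 'Month' and an encoded 'Region'."""
    out = df.copy()
    # Ordered categorical, so months sort and plot in calendar order.
    out["Month"] = pd.Categorical(out["Month"], categories=MONTHS, ordered=True)
    # Integer code per region; 'Region_Code' is a hypothetical column name.
    out["Region_Code"] = out["Region"].astype("category").cat.codes
    return out


# Made-up rows standing in for the real dataset (values are illustrative only).
sample = pd.DataFrame({
    "Month": ["January", "February", "January"],
    "Year": [2016, 2016, 2016],
    "Region": ["Region I", "Region I", "NCR"],
    "Dengue_Cases": [700, 400, 1000],
    "Dengue_Deaths": [1, 0, 2],
})

null_counts = sample.isnull().sum()  # missing values per column
column_types = sample.dtypes         # data type of each column
cleaned = preprocess(sample)
```

Keeping 'Month' as an ordered categorical, rather than plain text, means later group-bys and trend plots follow calendar order instead of alphabetical order.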
Descriptive Statistics:
- Generate summary statistics to understand the dataset's distribution.
Trend Visualization:
- Visualize trends in dengue cases and deaths over the years to identify patterns and anomalies.
Clustering Analysis:
- Apply K-Means clustering to group regions based on dengue cases and deaths.
- Use the Elbow Method to determine the optimal number of clusters.
Cluster Visualization:
- Visualize the clusters to identify regional patterns.
Model Evaluation:
- Evaluate the performance of the clustering using the Calinski-Harabasz Index.
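
The clustering and evaluation steps above might look something like the sketch below. It is written under several assumptions that are not confirmed by the notebook: that regions are clustered on their total cases and deaths, that the features are standardized first, and that the columns are named `Dengue_Cases` and `Dengue_Deaths`. The per-region totals are synthetic, and `chosen_k` is a hypothetical value standing in for whatever the elbow plot suggests on the real data.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score
from sklearn.preprocessing import StandardScaler

# Synthetic per-region totals (illustrative only, not real dengue figures):
# six lower-burden regions followed by six higher-burden regions.
rng = np.random.default_rng(0)
totals = pd.DataFrame({
    "Region": [f"Region {i}" for i in range(1, 13)],
    "Dengue_Cases": np.concatenate(
        [rng.normal(5000, 500, 6), rng.normal(40000, 2000, 6)]
    ),
    "Dengue_Deaths": np.concatenate(
        [rng.normal(20, 5, 6), rng.normal(200, 20, 6)]
    ),
})

# Standardize so cases (large numbers) do not dominate deaths (small numbers).
X = StandardScaler().fit_transform(totals[["Dengue_Cases", "Dengue_Deaths"]])

# Elbow Method: record the within-cluster sum of squares (inertia) for each k.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = model.inertia_

# Hypothetical choice; in practice, pick k where the inertia curve flattens.
chosen_k = 2
final_model = KMeans(n_clusters=chosen_k, n_init=10, random_state=42).fit(X)
totals["Cluster"] = final_model.labels_

# Calinski-Harabasz Index: higher values mean denser, better-separated clusters.
ch_score = calinski_harabasz_score(X, final_model.labels_)
```

Plotting `inertias` against k (for example with matplotlib) gives the elbow curve. The Calinski-Harabasz Index has no fixed scale, so it is most useful for comparing candidate values of k on the same data.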

Results

The analysis revealed distinct groupings of regions based on dengue cases and deaths. Regions with the highest cases and deaths were identified, indicating the need for extensive public health measures in these areas. The clustering results provide valuable insights for designing targeted interventions to combat dengue.

Conclusion

This project demonstrates the application of data analysis and clustering techniques to understand the distribution and impact of dengue cases in the Philippines. The insights derived from the analysis can help public health officials allocate resources more effectively and design targeted interventions to reduce the incidence of dengue.

Requirements

Python 3.x
Jupyter Notebook
Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn

How to Run

Clone this repository:

git clone https://github.com/yourusername/dengue-cases-analysis.git
