GitHub - yourarnav/CS699-SW-Lab: 🚀 CS 699 Project (2023) Setup IX: At the end of Readme.md

Contributors 🧑‍💻

Anuj Attri (23M0808) 👨‍🎓
Arnav Attri (23M0811) 👨‍🎓

Welcome to 🧠 Strokes Uncovered: Data Analysis, Visualization, and Predictive Insights – your gateway to unraveling the mysteries behind strokes! 🧐

Introduction 🚀

The project, Strokes Uncovered: Data Analysis, Visualization, and Predictive Insights, performs exploratory data analysis (EDA), data visualization, and predictive modeling on a publicly available dataset. The EDA relies on histograms, scatter plots, bar charts, and heatmaps. Stroke is a significant global health concern, accounting for approximately 11% of deaths worldwide according to the World Health Organization (WHO). The primary objective of this project is to better understand the risk factors that influence stroke and so support more effective preventive measures and earlier interventions.

About the Dataset 📊

The dataset contains more than 5000 data points and 10 input features: age, hypertension, heart disease, marital status, work type, residence type, average glucose level, BMI, smoking status, and gender.
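
As a rough illustration of a first look at the data, the sketch below reads CSV rows and reports the row count plus the number of usable values and the mean for selected numeric columns. It uses only the Python standard library so it runs without extra packages; the notebook itself may well use other libraries. The column names (`age`, `bmi`, `stroke`), the `N/A` marker for missing values, and the two sample rows are assumptions made for this example and are not taken from `stroke.csv`.

```python
import csv
import io
import math


def load_rows(source):
    """Read CSV rows from an open file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def to_float(value):
    """Convert a CSV cell to float; return None for blank, non-numeric, or NaN cells."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def summarize(rows, numeric_columns):
    """Return the row count and, per numeric column, the usable-value count and mean."""
    summary = {"n_rows": len(rows)}
    for column in numeric_columns:
        values = [v for v in (to_float(r.get(column)) for r in rows) if v is not None]
        summary[column] = {
            "count": len(values),
            "mean": sum(values) / len(values) if values else None,
        }
    return summary


# Made-up sample rows with hypothetical column names, for illustration only.
SAMPLE = "age,bmi,stroke\n60,28.0,1\n40,N/A,0\n"
rows = load_rows(io.StringIO(SAMPLE))
summary = summarize(rows, ["age", "bmi"])
print(summary)
```

To run the same summary on the real file, pass an open handle instead of the sample string, for example `open("stroke.csv", newline="")`, and adjust the column names to match the file's header.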

Objectives 📋

Data Analysis: The project will start with a comprehensive data analysis to uncover insights into attribute distributions and relationships. It will address specific questions and hypotheses using statistical methods such as descriptive statistics and hypothesis testing:
- What is the gender distribution in the dataset, and does it impact stroke likelihood?
- Is there a correlation between patient age and stroke risk?
- Does residence type significantly influence stroke risk?
- Are married individuals more or less likely to experience strokes compared to unmarried individuals in the dataset?
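
Questions like these usually reduce to comparing stroke rates across groups and testing whether the difference could be due to chance. The sketch below shows one possible approach, not necessarily the one used in the notebook: it computes the stroke rate per group and the Pearson chi-square statistic for a contingency table. The column names (`gender`, `stroke`) and all counts are invented for illustration.

```python
from collections import Counter


def stroke_rate_by_group(rows, group_key, target_key="stroke"):
    """Return the fraction of rows with target == 1 for each value of group_key."""
    totals = Counter()
    positives = Counter()
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        positives[group] += int(row[target_key])
    return {group: positives[group] / totals[group] for group in totals}


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows of counts).

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Invented rows with hypothetical column names.
example_rows = [
    {"gender": "Male", "stroke": "1"},
    {"gender": "Male", "stroke": "0"},
    {"gender": "Female", "stroke": "0"},
]
print(stroke_rate_by_group(example_rows, "gender"))

# Invented 2x2 table: rows are groups, columns are (stroke, no stroke).
example_table = [[10, 20], [20, 10]]
print(chi_square_statistic(example_table))
```

For a 2x2 table the statistic has one degree of freedom, so a value above roughly 3.841 indicates a difference that is significant at the 5% level.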
Data Visualization: The project will create informative visualizations using Python's Matplotlib and Plotly libraries to effectively convey dataset characteristics, relationships, and trends:
- Age, glucose levels, and BMI will be depicted through histograms and density plots.
- Gender distribution and marital status will be visualized using bar charts.
- Scatter plots will explore attribute relationships with stroke risk.
- An interactive Plotly visualization will offer dynamic dataset exploration.
Stroke Prediction Model: In this phase of the project, we will build a predictive model employing machine learning algorithms. The model's purpose is to discern individuals at higher risk of stroke by analyzing the attributes within our dataset.
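
The README does not name the algorithm behind the prediction model, so the following is only a minimal sketch of the general idea: a logistic regression fitted by gradient descent, which maps patient attributes to a risk score between 0 and 1. It is written in plain Python to stay dependency-free and is not the project's model; in practice a machine learning library would normally be used. The single feature (a standardized age) and the four training points are invented for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_risk(weights, bias, x):
    """Return the predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit logistic regression with batch gradient descent on the mean log loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_risk(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


# Invented training data: one standardized feature, label 1 means stroke.
FEATURES = [[-0.3], [-0.2], [0.2], [0.3]]
LABELS = [0, 0, 1, 1]

weights, bias = fit_logistic(FEATURES, LABELS)
print(predict_risk(weights, bias, [0.3]))   # risk score for a higher feature value
print(predict_risk(weights, bias, [-0.3]))  # risk score for a lower feature value
```

On real data the features would need encoding and scaling first, and the model should be evaluated on held-out data rather than on the rows it was trained on.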

Methods and Tools 🛠️

To fulfill project requirements, the following tools and technologies will be employed:

Python: Python and its various libraries will be used for data analysis, visualization, and model construction.
HTML Integration with Python (Flask): We will integrate HTML with Python using the Flask web framework. In this integration, HTML will serve as the front-end interface, while Flask will function as the back-end framework, enabling the creation of an interactive web application. Users can engage with our project through this interface.
LaTeX Integration: LaTeX will play a crucial role in creating a comprehensive and structured report to document our findings and project details.
Pyplot: Matplotlib's Pyplot interface will be used to create data visualizations, including bar charts, line plots, scatter plots, and histograms, that describe our insights and analysis results.
PostgreSQL (Optional): PostgreSQL will be employed for data storage and retrieval, particularly if the dataset size or database management complexity requires it.

Project Documentation 📖

Thorough documentation, including code comments, explanations, dataset sources, data pre-processing details, and model evaluation results, will be carefully drafted.

Conclusion and Impact 🌟

The project's primary objective is to offer insights into stroke risk factors by finding underlying trends and patterns in the data. Through data analysis, visualization, and predictive modeling, these insights can help healthcare professionals and policymakers develop targeted prevention strategies, promote healthier lifestyles, and allocate resources more effectively to reduce the burden of strokes on society.

Files in this GitHub Repository

code.ipynb: Jupyter Notebook containing the project code.
environment.yml: Environment file for recreating the project's Python environment.
README.md: You're reading it right now! 😉
requirements.txt: Python package requirements.
stroke.csv: The dataset used for analysis.
Setup steps.txt: Detailed terminal steps for running the project.
CS 699 Project Roadmap.pdf: Additional information about the project's roadmap.
CS699 Proposal.pdf: From where it all began. 😊

Setup 🛠️

Follow these steps in your terminal to set up the environment:

Navigate to the Project Folder:
- Open your terminal and use the cd command to go to the folder where you cloned or downloaded this repository.
Activate the Conda Environment:
- Activate the Conda environment using the following command:
```
conda activate lightgbm-env
```
Launch Jupyter Notebook:
- Start Jupyter Notebook by running:
```
jupyter notebook
```
Deactivate Conda Environment:
- Once you're done working with the Jupyter (ipynb) notebook, deactivate the Conda environment using:
```
conda deactivate
```

This setup gets you ready to work on the project with the required environment. Enjoy your data exploration and analysis!

Feel free to explore, contribute, and uncover the secrets of strokes with us! 🕵️‍♀️🔍

License: This project is licensed under the MIT License.

