This project implements a dual-model approach for heart failure prediction using both clinical data and MRI images. The system combines a Random Forest model for analyzing clinical parameters and a Convolutional Neural Network (CNN) for processing medical imaging data, providing medical professionals with comprehensive diagnostic support.
- Dual Prediction Methods:
  - Clinical data analysis using Random Forest
  - MRI image analysis using CNN
- Interactive Web Interface:
  - User-friendly Streamlit application
  - Real-time predictions
heart_failure_pred/
├── artifacts/
│   ├── data/
│   │   ├── img_data/
│   │   │   ├── normal/
│   │   │   └── failure/
│   │   └── tabular_data/
│   │       └── heart_failure.csv
│   └── models/
│       ├── cnn/
│       │   ├── cnn_model.keras
│       │   └── cnn_score.json
│       └── rf/
│           ├── rf_model.pkl
│           └── rf_score.json
├── main.py
├── utils.py
├── app.py
├── requirements.txt
└── README.md
- Python 3.8 or higher
- pip package manager
- Clone the repository:
git clone https://github.com/AryanDhanuka10/Heart-Failure-Prediction.git
cd heart_failure_pred
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows, use: venv\Scripts\activate
- Install required packages:
pip install -r requirements.txt
To train both the Random Forest and CNN models:
python main.py
This script will:
- Load and preprocess the data
- Train the Random Forest model with hyperparameter tuning
- Train the CNN model
- Save the trained models and their performance metrics
To launch the web interface:
streamlit run app.py
- Navigate to the "Clinical Data Prediction" tab
- Enter the required clinical parameters:
  - Age
  - Anaemia status
  - Creatinine phosphokinase level
  - Diabetes status
  - Ejection fraction
  - High blood pressure status
  - Platelets count
  - Serum creatinine level
  - Serum sodium level
  - Sex
  - Smoking status
  - Follow-up period
- Click "Predict" to get results
- Navigate to the "MRI Image Prediction" tab
- Upload a DICOM format MRI image
- View the prediction results
- All numerical inputs should be non-negative
- Binary inputs (anaemia, diabetes, high blood pressure, sex, smoking) use 0/1 encoding
- Input ranges:
  - Age: 0-100 years
  - Ejection fraction: 0-100%
  - Platelets: typical range 150,000-450,000 per μL
  - Time (follow-up period): measured in days
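
The clinical input rules above can be checked before a prediction is requested. The following is a minimal sketch, not the project's actual validation code: the field names and the `validate_clinical_inputs` helper are assumptions chosen to mirror the twelve parameters listed earlier, and the "typical" platelet range is deliberately not enforced because the rules describe it as typical rather than required.

```python
# Hypothetical field names; the real CSV column names may differ.
BINARY_FIELDS = ("anaemia", "diabetes", "high_blood_pressure", "sex", "smoking")
NUMERIC_FIELDS = (
    "age", "creatinine_phosphokinase", "ejection_fraction",
    "platelets", "serum_creatinine", "serum_sodium", "time",
)
# Upper bounds stated in the input requirements (years and percent).
UPPER_BOUNDS = {"age": 100, "ejection_fraction": 100}


def validate_clinical_inputs(values):
    """Return a list of problems; an empty list means the inputs pass."""
    problems = []
    for name in BINARY_FIELDS:
        if name not in values:
            problems.append(f"{name}: missing")
        elif values[name] not in (0, 1):
            problems.append(f"{name}: must be 0 or 1")
    for name in NUMERIC_FIELDS:
        if name not in values:
            problems.append(f"{name}: missing")
            continue
        value = values[name]
        if value < 0:
            problems.append(f"{name}: must be non-negative")
        elif name in UPPER_BOUNDS and value > UPPER_BOUNDS[name]:
            problems.append(f"{name}: must be at most {UPPER_BOUNDS[name]}")
    return problems


# Example record with made-up values, for illustration only.
example = {
    "age": 60, "anaemia": 0, "creatinine_phosphokinase": 250,
    "diabetes": 1, "ejection_fraction": 38, "high_blood_pressure": 0,
    "platelets": 260000, "serum_creatinine": 1.1, "serum_sodium": 137,
    "sex": 1, "smoking": 0, "time": 90,
}
print(validate_clinical_inputs(example))
```

Returning a list of messages rather than raising on the first failure lets a form report every problem at once.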
- Format: DICOM (.dcm)
- Processed to 224x224 pixels
- Normalized to [0,1] range
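
The image requirements above (224x224 pixels, values in [0, 1]) can be illustrated with a small NumPy sketch. It is an assumption-laden stand-in for the project's own preprocessing: the `preprocess_slice` name, the nearest-neighbour resize, and the min-max scaling are all illustrative choices, since the README does not say which resize or normalization method is used.

```python
import numpy as np


def preprocess_slice(pixels, size=224):
    """Resize a 2-D image to size x size and scale it to the [0, 1] range."""
    img = np.asarray(pixels, dtype=np.float32)
    if img.ndim != 2:
        raise ValueError("expected a single 2-D grayscale slice")

    # Nearest-neighbour resize: map each target index to a source index.
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    img = img[rows][:, cols]

    # Min-max normalization; a constant image becomes all zeros.
    low, high = img.min(), img.max()
    if high > low:
        img = (img - low) / (high - low)
    else:
        img = np.zeros_like(img)

    # Add a channel axis so the result has shape (size, size, 1).
    return img[..., np.newaxis]


# Synthetic stand-in for pixel data. With a real file, the array could come
# from pydicom, e.g. pydicom.dcmread(path).pixel_array (not executed here).
demo = preprocess_slice(np.arange(512 * 256).reshape(512, 256))
print(demo.shape)
```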
- Features: 12 clinical parameters
- Hyperparameter tuning via GridSearchCV
- Optimization metric: Accuracy
- Output: Binary classification with probability scores
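
To make the Random Forest description concrete, here is a hedged sketch of accuracy-driven tuning with scikit-learn's `GridSearchCV`. The synthetic data, the parameter grid, and the fold count are assumptions for illustration; the grid actually searched in `main.py` is not stated in this README.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the 12 clinical features (not real patient data).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 12))
y = (X[:, 0] + X[:, 4] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Illustrative grid; the project's real search space may differ.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="accuracy",  # the optimization metric named above
    cv=3,
)
search.fit(X_train, y_train)

# The refit best estimator exposes both hard labels and class probabilities.
labels = search.predict(X_test)
proba = search.predict_proba(X_test)
print(search.best_params_)
```

`predict_proba` returns one column per class, which is what allows the binary prediction to be reported together with a probability score.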
- Architecture: Sequential model with multiple convolutional layers
- Input shape: (224, 224, 1)
- Training: Adam optimizer with sparse categorical crossentropy
- Output: Binary classification (normal/failure)
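
A Sequential CNN matching the properties listed above might look like the sketch below. Layer counts, filter sizes, and the `build_cnn` helper are assumptions, not the project's actual architecture; only the input shape, the Adam optimizer, and the sparse categorical crossentropy loss come from this README. A two-unit softmax output is assumed because that loss expects integer class labels (for example 0 = normal, 1 = failure).

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(input_shape=(224, 224, 1), num_classes=2):
    """Build and compile a small illustrative CNN for grayscale slices."""
    model = keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        # Collapse the spatial dimensions before the classifier head.
        layers.GlobalAveragePooling2D(),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_cnn()
model.summary()
```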
The system saves performance metrics for both models:
- For Random Forest: accuracy, F1 score, and detailed classification report
- For CNN: loss and accuracy metrics
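
As an illustration of the Random Forest metrics, the sketch below computes accuracy and F1 for a binary task and serializes them as JSON. The `binary_metrics` helper, the key names, and the example labels are assumptions; the project may instead rely on scikit-learn's metric functions, and the real layout of `rf_score.json` is not documented here.

```python
import json


def binary_metrics(y_true, y_pred):
    """Compute accuracy and F1 (positive class = 1) from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": correct / len(pairs), "f1_score": f1}


# Made-up labels, for illustration only.
scores = binary_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])

# The same dictionary could be written to a file with json.dump.
serialized = json.dumps(scores, indent=2)
print(serialized)
```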
Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
