WaterQualityAnalyses

This project aims to predict water quality based on various parameters. The system includes a web-based form allowing users to input specific water quality parameters, and a machine learning model implemented in Python and Flask predicts whether the water is of good quality or not.

Usage

Web Application

Open the index.html file in a web browser.
Fill in the form with the relevant parameters, including pH, hardness, solids, chloramines, sulfate, conductivity, organic carbon, trihalomethanes, and turbidity.
Click the "Submit" button to send the data to the server for prediction.
The prediction result will be displayed on the page, indicating whether the water is of good quality or not.

Python Application

Ensure you have the required Python libraries installed. You can install them using:
```
pip install flask numpy scikit-learn
```
Run the Flask application:
```
python app.py
```
Open a web browser and go to http://127.0.0.1:5000/ to access the web form.
Follow steps 2-4 from the "Web Application" section above.

Model Training

The machine learning model used for prediction is a Random Forest Classifier. The model is trained using a dataset (processed_data.csv) that includes water quality parameters and a target variable indicating water potability.

To retrain the model:

Open a Jupyter notebook or Python environment.
Execute the code in the "Model Training" section of the provided Python script (model_training.py).
The trained model will be saved to a file named water_model.pkl.

Data Preprocessing

The dataset used for training may contain missing values, which are handled by imputing the mean values for the respective columns. Additionally, exploratory data analysis (EDA) is performed, and correlations between variables are visualized using various plots and the corrplot library in R.

Dependencies

Flask
NumPy
Scikit-learn
Pandas
Pickle
ggplot2 (R package)
corrplot (R package)
skimr (R package)

Additional Notes

The style.css file contains styles for the HTML elements.
The application includes visualizations generated using ggplot2 and corrplot in R. These visualizations provide insights into the dataset's distribution and correlations between variables.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
templates		templates
.gitignore		.gitignore
README.md		README.md
WaterQuality (1).ipynb		WaterQuality (1).ipynb
app.py		app.py
knn_model.py		knn_model.py
logistic_model.py		logistic_model.py
model.py		model.py
processed_data.csv		processed_data.csv
water_model.pkl		water_model.pkl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

WaterQualityAnalyses

Usage

Web Application

Python Application

Model Training

Data Preprocessing

Dependencies

Additional Notes

About

Releases

Packages

Contributors 2

Languages

cleo-cyber/WaterQualityAnalyses

Folders and files

Latest commit

History

Repository files navigation

WaterQualityAnalyses

Usage

Web Application

Python Application

Model Training

Data Preprocessing

Dependencies

Additional Notes

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages