Create the environment:
conda create -n wineq python=3.7 -y
Activate the environment:
conda activate wineq
Create a requirements file (requirements.txt).
Install the requirements:
pip install -r requirements.txt
Download the data from the link below and save it as data_given/winequality.csv:
https://drive.google.com/drive/folders/18zqQiCJVgF7uzXgfbIJ-04zgz1ItNfF5?usp=sharing
git init
dvc init
dvc add data_given/winequality.csv
git add .
git commit -m "first commit"
One-liner for README updates:
git add . && git commit -m "update Readme.md"
git remote add origin https://github.com/coolmunzi/webapp_mlops.git
git branch -M main
git push origin main
Add/update stages:
- get_data.py: Reads the data from the CSV file into a DataFrame.
- load_data.py: Loads the captured data, processes it, and stores the processed data as a CSV file.
- split_Data.py: Splits the full dataset into training and test sets.
- train_and_evaluate.py: Trains the model and evaluates its performance.
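The first three stages can be sketched roughly as below. This is a standard-library illustration of the data flow only, not the project's scripts, which are not shown here and presumably use pandas. The function bodies, the column-name clean-up used as the "processing" step, the 20% test size, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import random


def get_data(csv_text):
    """Stage 1 (get_data.py): read CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def load_data(rows):
    """Stage 2 (load_data.py): assumed processing step, replacing spaces
    in column names with underscores."""
    return [
        {key.strip().replace(" ", "_"): value for key, value in row.items()}
        for row in rows
    ]


def split_data(rows, test_size=0.2, seed=42):
    """Stage 3 (split_Data.py): shuffle a copy and split into train/test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up sample rows standing in for data_given/winequality.csv.
SAMPLE_CSV = (
    "fixed acidity,pH,quality\n"
    "7.4,3.51,5\n"
    "7.8,3.20,5\n"
    "11.2,3.16,6\n"
    "7.4,3.51,5\n"
    "6.0,3.30,7\n"
)

rows = load_data(get_data(SAMPLE_CSV))
train, test = split_data(rows)
print(len(train), len(test))  # 4 training rows and 1 test row
```

Fixing the seed keeps the split reproducible, which matters when DVC decides whether a stage's outputs have changed.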
Update the stages in dvc.yaml.
Run the pipeline so that DVC executes and tracks all stages:
dvc repro
To see the model evaluation metrics tracked by DVC:
dvc metrics show
If you change the hyperparameters and later want to see how the metrics changed compared with the last commit:
dvc metrics diff
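Conceptually, the comparison boils down to subtracting the old metric values from the new ones. The sketch below illustrates that idea in plain Python; it is not how DVC implements it, and the metric names and scores are made up for the example.

```python
import json


def metrics_diff(old, new):
    """Return {metric: (old, new, change)} for metrics present in both runs."""
    return {
        name: (old[name], new[name], round(new[name] - old[name], 6))
        for name in sorted(old.keys() & new.keys())
    }


# Hypothetical scores from two runs; real values would come from the
# metrics file written by the training stage.
old_scores = json.loads('{"rmse": 0.75, "mae": 0.60, "r2": 0.25}')
new_scores = json.loads('{"rmse": 0.70, "mae": 0.55, "r2": 0.30}')

diff = metrics_diff(old_scores, new_scores)
for name, (before, after, change) in diff.items():
    print(f"{name}: {before} -> {after} ({change:+})")
```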
Add/update the testing files __init__.py, conftest.py, schema_in.json and test_config.py inside the tests directory.
NOTE: Tests can be run with pytest (via pytest -v) or with tox.
Create schema_in.json, which records the min and max values of every column, using the following Python snippet:
import pandas as pd
df = pd.read_csv('data_given/winequality.csv')
overview = df.describe()
overview.loc[["min", "max"]].to_json("schema_in.json")
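With pandas' default JSON orientation, the file written above maps each column name to its "min" and "max" values. A minimal sketch of how such a schema might be used to validate incoming values follows. The `NotInRange` exception, the `validate_input` helper, and the bounds shown are hypothetical and not taken from the project; in practice the schema would be loaded from schema_in.json.

```python
import json


class NotInRange(Exception):
    """Raised when a value falls outside the range recorded in the schema."""


def validate_input(record, schema):
    """Check every value in `record` against the schema's min/max bounds."""
    for column, value in record.items():
        if column not in schema:
            raise KeyError(f"unknown column: {column}")
        low, high = schema[column]["min"], schema[column]["max"]
        if not low <= float(value) <= high:
            raise NotInRange(f"{column}={value} is outside [{low}, {high}]")
    return True


# Illustrative bounds in the same layout as schema_in.json.
schema = json.loads(
    '{"pH": {"min": 2.74, "max": 4.01}, "alcohol": {"min": 8.4, "max": 14.9}}'
)

print(validate_input({"pH": 3.3, "alcohol": 10.0}, schema))  # True

try:
    validate_input({"pH": 9.0, "alcohol": 10.0}, schema)
except NotInRange as error:
    print(f"rejected: {error}")
```

Raising a dedicated exception lets a test assert on the failure type rather than on a message string.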
To run the MLflow server:
mlflow server --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./artifacts --host 0.0.0.0 -p 1234
To run the tests with tox, add/update the tox.ini file.
Run the tests with:
tox
To rebuild the test environment when the requirements change:
tox -r
Create setup.py to package src. After adding or updating setup.py, run the following command to install the src package in editable mode:
pip install -e .
To build a wheel file for the src package (only if you really need one):
python setup.py sdist bdist_wheel
For the CI/CD workflow, add/update the ci-cd.yaml file under .github/workflows, which defines the GitHub Actions workflow.
Create a new web app in Heroku and connect it to your GitHub repository. Choose Automatic Deploy in Heroku and enable "Wait for CI to pass before deploy". Create the HEROKU_APP_NAME and HEROKU_API_TOKEN secrets in GitHub. (NOTE: generate the Heroku API token from applications -> create authorization -> define api token.)
This app is deployed at https://wine-quality-analysis.herokuapp.com/ with a CI/CD pipeline.