This repository has been archived by the owner on Apr 10, 2024. It is now read-only.

Streamlit Demo

This repository contains the files for the Project Website Workshop at DataHacks 2024. The workshop is aimed at students who want to create a website for their project with a working model embedded in it.

Installation

  1. (Fork and) Clone the repo.
  2. Create a virtual environment and install the packages listed in requirements.txt.
python3 -m venv your_env
source your_env/bin/activate  # on Windows (Git Bash): source your_env/Scripts/activate
pip install -r requirements.txt
  3. You may need to install streamlit and joblib manually at this point. Even though they are listed in requirements.txt, they sometimes fail to install.
pip install streamlit joblib
  4. Launch your local server by running the following.
streamlit run main.py
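
Since streamlit and joblib sometimes fail to install from requirements.txt, it can help to confirm that both are importable inside the activated environment before launching the app. The following is a minimal sketch that uses only the standard library; the helper name `missing_packages` is hypothetical and is not part of this repository.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    # Check the two packages the installation steps call out.
    missing = missing_packages(["streamlit", "joblib"])
    if missing:
        print("Missing packages, try: pip install " + " ".join(missing))
    else:
        print("streamlit and joblib are importable.")
```

Run it with the virtual environment activated; if anything is reported missing, the manual `pip install` step above should resolve it.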

Saving Models for Production

For an example, see model-dev/model-dev.ipynb.

  1. If you are developing your model in a Jupyter Notebook, import joblib so that you can save the model to a file.
# https://joblib.readthedocs.io/en/stable/
import joblib
  2. In this example, we serialize (save) a Scikit-Learn pipeline that uses a RandomForestClassifier (RFC) trained on the well-known Titanic dataset. Use dump to save your model.
# dump( <YOUR_ML_MODEL_OBJ>, <FILENAME> )
joblib.dump(pipeline, "RandomForestClassifier.pkl")
  3. Find the version of the package you used to build the model. This example uses Scikit-Learn, so check which version of Scikit-Learn is installed in the notebook environment.
!pip list | grep scikit-learn
# scikit-learn  1.2.1
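
The version lookup can also be done from Python, which makes it easy to store the version next to the saved model so it is not forgotten later. The following is a minimal sketch and not the workshop's own code: the helper names `installed_version` and `write_version_note`, and the `.versions.json` file suffix, are assumptions made for illustration. It relies only on `importlib.metadata` from the standard library (Python 3.8+).

```python
import json
from importlib import metadata
from pathlib import Path


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def write_version_note(model_path, distributions):
    """Write a JSON file next to the model listing the package versions in use."""
    note_path = Path(str(model_path) + ".versions.json")
    versions = {name: installed_version(name) for name in distributions}
    note_path.write_text(json.dumps(versions, indent=2))
    return note_path


if __name__ == "__main__":
    # Hypothetical usage right after joblib.dump(pipeline, "RandomForestClassifier.pkl")
    print(write_version_note("RandomForestClassifier.pkl", ["scikit-learn", "joblib"]))
```

The resulting JSON file can be committed alongside the model, so the version to pin in the Streamlit environment is recorded with it.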

Using Pre-trained Models in Streamlit

An example of this being done is under main.py starting from line 63.

  1. The package version in your virtual environment may not match the one used in your Jupyter Notebook. This is especially common if the notebook runs in a Conda environment. Make sure your virtual environment has the same version of the ML package that you used in the notebook. If the versions do not match, uninstall the package and reinstall it with the correct version.
pip uninstall scikit-learn
pip install scikit-learn==1.2.1
  2. In your Streamlit file, import joblib. Use the load function to load your model into a variable. From there, you can treat it the same as the model in the notebook.
import joblib

# load( <FILEPATH_TO_MODEL> )
pipeline = joblib.load('./model-dev/RandomForestClassifier.pkl')
  3. Create user inputs that are in line with the model inputs, and use them to predict. An example using the Titanic pipeline is below.
import streamlit as st

# Get and validate user input
pclass = st.radio('Pclass', [1, 2, 3])
sex = st.radio('Sex', ['male', 'female'])
age = st.number_input('Age')
sibsp = st.number_input('SibSp', step=1)
parch = st.number_input('Parch', step=1)
fare = st.number_input('Fare')
embarked = st.radio('Embarked', ['C', 'Q', 'S'])
  4. Deploy using the Deploy button at the top right.
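
The input widgets above still have to be turned into something the loaded model can consume. The sketch below shows one way to do that, under the assumption that the pipeline was fitted on a DataFrame whose column names match the widget labels (`Pclass`, `Sex`, `Age`, `SibSp`, `Parch`, `Fare`, `Embarked`); check model-dev/model-dev.ipynb and main.py for the actual column names and order. The helper `build_feature_row` is hypothetical, and the Streamlit and pandas calls are shown as comments so that the helper can be tried on its own.

```python
def build_feature_row(pclass, sex, age, sibsp, parch, fare, embarked):
    """Validate the widget values and return one row keyed by column name."""
    if pclass not in (1, 2, 3):
        raise ValueError(f"Pclass must be 1, 2 or 3, got {pclass!r}")
    if sex not in ("male", "female"):
        raise ValueError(f"Sex must be 'male' or 'female', got {sex!r}")
    if embarked not in ("C", "Q", "S"):
        raise ValueError(f"Embarked must be 'C', 'Q' or 'S', got {embarked!r}")
    if min(age, sibsp, parch, fare) < 0:
        raise ValueError("Age, SibSp, Parch and Fare must be non-negative")
    # Column names are assumed to match the Titanic dataset headers.
    return {
        "Pclass": int(pclass),
        "Sex": sex,
        "Age": float(age),
        "SibSp": int(sibsp),
        "Parch": int(parch),
        "Fare": float(fare),
        "Embarked": embarked,
    }


# Hypothetical use inside the Streamlit file, after the widgets and joblib.load(...):
#   import pandas as pd
#   row = build_feature_row(pclass, sex, age, sibsp, parch, fare, embarked)
#   prediction = pipeline.predict(pd.DataFrame([row]))[0]
#   st.write("Predicted label:", prediction)
```

Keeping the validation in a plain function makes it possible to test the input handling without starting the Streamlit server.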

License

This project is licensed under the MIT License. See LICENSE for more information.