Spam-Detection-with-FastAPI


Note: Access the NLP documentation here.

Last updated December 28th, 2023.

Note: The true scope of this project involves the full implementation and integration of the classifier into a website. This repository details the backend side of machine learning.

This machine learning (ML) project was created to demonstrate the deployment of an ML-based API, using the cloud provider Render to set up the environment and host the Docker container. The Swagger documentation of the API is available here for viewing.

Table of Contents

  • Objective
  • MLOps — Where does my model go from here?
  • Project Setup
  • Instructions to build FastAPI app
  • Deployment
  • See More

Objective

The objective of this repository is to serve a prediction model that accurately classifies texts as spam, using the FastAPI framework, and to provide an educational experience along the way. Happy coding!

The original dataset can be found here.

Photo by Hannes Johnson on Unsplash

MLOps — Where does my model go from here?

The "Spam Detection backend" project aims to provide a brief outline of Machine Learning Operations (MLOps) by focusing on certain steps necessary to deploy your ML model. Below I have provided helpful documentation that allowed me to complete this project.

Helpful Resources

  1. Machine Learning Mastery: Save and Load ML Models in Python

    In this article you will discover how to save and load your model with pickle or joblib. You will then be able to reuse your saved file to make predictions at this stage (see the sketch after this list).

  2. Integrating ML classifier with FastAPI

    Explanation of the overall backend pipeline for ML models. Granted, I did not follow this article much, but its main.py file provides a simple overview of what integrating your saved model with FastAPI looks like.

  3. Building a Machine Learning API in 15 Minutes

    Very useful video on how an API project may be deployed! Additionally, you can run the application with uvicorn app:app --reload at this stage.

  4. FastAPI in Containers - Docker

    Another helpful tutorial that demonstrates the purpose of Docker and details how to create a Dockerfile. You can build the Docker image and start the container at this stage.

  5. Share the Application - Docker

    After building your Docker image, you can share it on Docker Hub. Sharing allows for easy integration into a cloud environment and demonstrates the portability of containers. You can run your application on a hosted site at this stage.
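To illustrate step 1, here is a minimal sketch of saving and loading a scikit-learn model with joblib. The toy data and file names are assumptions for illustration, not this repository's actual training code.

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data, purely for illustration
texts = ["WIN a FREE prize now!!", "see you at lunch tomorrow"]
labels = [1, 0]  # 1 = spam, 0 = ham

vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(texts), labels)

# Persist both artifacts...
joblib.dump(model, "finalized_model.sav")
joblib.dump(vectorizer, "vectorizer.sav")

# ...and load them back later to make predictions
model = joblib.load("finalized_model.sav")
vectorizer = joblib.load("vectorizer.sav")
print(model.predict(vectorizer.transform(["free prize now"])))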

Project Setup

Note: You may ignore this section if you are only interested in deploying the model. The commands below can be copied and run in your terminal to easily simulate my project environment.

To set up the project environment locally, follow these steps:

  1. Cloning the Repository

     git clone https://github.com/weezymatt/Spam-Detection-backend.git
     cd Spam-Detection-backend

  2. Setting up the Virtual Environment

     • Windows

       python -m venv <virtual-environment-name>
       <virtual-environment-name>\Scripts\activate

     • Linux and macOS

       python3 -m venv <virtual-environment-name>
       source <virtual-environment-name>/bin/activate

  3. Install the Required Dependencies

    The virtual environment will make use of its own pip, so you don't need to use pip3.

    pip install -r requirements.txt

Instructions to build FastAPI app

Note: You may ignore this section and go to Deployment if you're knowledgeable about APIs. The Jupyter Notebook is useful for testing predictions with your model before making the app.py file.

There are a few steps required to build your FastAPI app and capture the essence of your model. Here we briefly discuss how to write the app.py code and the Dockerfile.

Create the API

Following the initialization of your virtual environment, we will write the app.py file and initialize an instance of FastAPI.

from fastapi import FastAPI, HTTPException

app = FastAPI(title="Ham or Spam API", description="API to predict SMS spam")

Load the saved models with joblib. The vectorizer is loaded so that incoming text goes through the same preprocessing steps as in the Jupyter Notebook.

import joblib

model = joblib.load("model/finalized_model.sav")
vectorizer = joblib.load("model/vectorizer.sav")

Define the data format for incoming input.

from pydantic import BaseModel

class request_body(BaseModel):
    message: str  # e.g. "A free service for you ONLY!! Please click on the link now!"

Process the input sent by the user.

def process_msg(msg):
    """
    Replace email address with 'email'
    Replace URLS with 'http'
    Replace currency symbols with 'moneysymb'
    Replace phone numbers with 'phonenumb'
    Replace numbers with 'numb'
    """
    ... 

    return clean_input
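The function body is elided above. As an illustration only, the replacements described in the docstring might be implemented with regular expressions along these lines; the patterns below are assumptions, not the repository's actual code, and the vectorizer from earlier is applied at the end so the model receives the representation it expects.

import re

def process_msg(msg):
    """Normalize a raw SMS message and vectorize it for the model (illustrative sketch)."""
    clean = msg.lower()
    clean = re.sub(r"\S+@\S+", "email", clean)                    # email addresses
    clean = re.sub(r"http\S+|www\.\S+", "http", clean)            # URLs
    clean = re.sub(r"[$£€]", "moneysymb", clean)                  # currency symbols
    clean = re.sub(r"\+?\d[\d\s().-]{7,}\d", "phonenumb", clean)  # phone numbers
    clean = re.sub(r"\d+", "numb", clean)                         # remaining numbers
    # Vectorize with the same vectorizer used at training time
    return vectorizer.transform([clean])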

Define the GET method.

@app.get('/')
def welcome():
    return {'message': 'Welcome to the Spam classifier API!'}

Create the POST method (this is the meat of your API).

@app.post('/api_predict')
def classify_msg(msg: request_body):
    if not msg.message:
        raise HTTPException(status_code=400, detail="Please provide a valid message")

    # Process the message to fit the model's expected input
    dense = process_msg(msg.message)

    # Classification results
    label = model.predict(dense)[0]
    # proba = model.predict_proba(dense)  # check again after test

    # Extract the corresponding information
    if label == 0:
        return {'Answer': "This is a Ham email!"}
    else:
        return {'Answer': "This is a Spam email!"}
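Before moving on, you can sanity-check the endpoint in-process with FastAPI's TestClient. This is a hypothetical quick check, assuming the snippets above live in app.py and that the httpx package (required by TestClient) is installed.

from fastapi.testclient import TestClient
from app import app  # assumes the code above is saved as app.py

client = TestClient(app)

response = client.post(
    "/api_predict",
    json={"message": "A free service for you ONLY!! Please click on the link now!"},
)
print(response.status_code, response.json())
# Expected shape: 200 {'Answer': 'This is a Spam email!'}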

Write the requirements.txt

Realistically, you will have a virtual environment ready, install your dependencies throughout the project, and then freeze them into a text file.

The requirements.txt file enables us to recreate all the modules necessary for our application. This is crucial when we write our Dockerfile later on.

pip freeze > requirements.txt
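For reference, an abridged requirements.txt for a project like this might look roughly as follows. The package set and pinned versions here are assumptions for illustration; trust your own pip freeze output.

fastapi==0.104.1
joblib==1.3.2
nltk==3.8.1
scikit-learn==1.3.2
uvicorn==0.24.0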

Deactivate your virtual environment.

deactivate

Deployment

Here we develop the deployment in stages until we reach the container step where we are able to display the webpage.

FastAPI Deployment

  1. Open the terminal and navigate to the directory where your app.py file is located.

  2. Run the FastAPI application with the uvicorn command, specifying the application name. The --reload flag is useful during development because changes are automatically reflected.

    uvicorn <application-file>:app --reload
  3. After running the uvicorn command, the FastAPI application is up and running at the address listed (e.g. http://localhost:8000). This address is the API endpoint through which we access our application. We will see the importance of this endpoint during the front-end part of the project.

  4. You may open your browser to interact with your deployed FastAPI application. The endpoint acts as an intermediary between requests and responses (Press CTRL+C to quit).

API Documentation

The FastAPI documentation details the available endpoints, JSON request and response formats, and the information specified in your app.py file. You can access this documentation by adding /docs to the server address (http://localhost:8000/docs).
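You can also exercise the POST endpoint directly from the terminal while the server is running, for example with curl (the message below is illustrative):

curl -X POST http://localhost:8000/api_predict \
     -H "Content-Type: application/json" \
     -d '{"message": "A free service for you ONLY!! Please click on the link now!"}'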

(Screenshot: the /api_predict endpoint in the Swagger UI)

Containerized Deployment

When deploying an API, a common approach is to build a container image, so we will need to write a Dockerfile for the application.

Dockerfile

Dependency Issue: For the Docker container to properly run, an additional file initializing the NLTK stopwords was incorporated into the workflow. This may not be necessary in your process.
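A minimal sketch of what such an initialize.py might contain, assuming the NLTK stopwords corpus is the only resource the container is missing:

# initialize.py
# Download NLTK data at image build time so the running container
# does not need to fetch it at startup.
import nltk

nltk.download("stopwords")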

FROM python:3.11.5

WORKDIR /code

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

COPY ./initialize.py /code/initialize.py
RUN python3 /code/initialize.py

COPY . .

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80"]

Docker Image

If you are using an ARM-based Mac with Apple Silicon, you will need to rebuild the image to be compatible and push the new image to your repository on Docker Hub. Otherwise the process is rather straightforward. Click here for a solution on Stack Overflow.
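If you have not yet built the image locally, a plain build from the directory containing your Dockerfile looks like this (<image-title> is a placeholder name you choose):

docker build -t <image-title> .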

With the image built, use the docker tag command to give it a new name.

docker tag <image-title> YOUR-USERNAME/<dockerhub-repo>

Switch to a new buildx driver before your build (here we follow the process for Apple Silicon Macs).

docker buildx create --use

Launch the following command to build your Docker image.

docker buildx build --platform linux/amd64,linux/arm64 -t <tag> .

Push the image to Docker Hub.

docker buildx build --push --platform linux/amd64,linux/arm64 -t docker.io/YOUR-USERNAME/<dockerhub-repo>:latest .

You may run your image to verify it is working and visit the server (http://localhost:8000/docs).

docker run -d --name mycontainer -p 8000:80 <image-title>

See More

There you have it! You can use your saved image on Docker Hub with your cloud environment of choice and start the next step of your application. For the second part of this project involving the front-end piece, please click here.
