
NERSC Chatbot Deployment

Deploy Hugging Face large language models (LLMs) on NERSC supercomputers using Slurm and the vLLM serving framework. This package supports both command-line interface (CLI) and Python library usage, with utilities for seamless Gradio integration on NERSC JupyterHub.

The deployed models expose an OpenAI-compatible API endpoint powered by vLLM, enabling easy integration with existing OpenAI clients and tools; access is restricted by a generated API key.

Quick Start

Install the package:

module load python
python -m pip install git+https://github.com/NERSC/nersc_chatbot_deploy

Deploy a model using the CLI:

nersc-chat -A your_account -m meta-llama/Llama-3.1-8B-Instruct
# Use `nersc-chat --help` for more options

Or deploy using the Python library:

from nersc_chatbot_deploy import deploy_llm

# deploy_llm submits the Slurm job and returns the launch process handle
# together with the generated API key for the endpoint.
proc, api_key = deploy_llm(
    account='your_account',        # NERSC project to charge for the job
    num_gpus=1,                    # number of GPUs to allocate
    queue='shared_interactive',    # Slurm queue/QOS to submit to
    time='01:00:00',               # walltime limit (HH:MM:SS)
    job_name='vLLM_test',
    model='meta-llama/Llama-3.1-8B-Instruct'
)
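
Once the job is running, any OpenAI-compatible client can talk to the endpoint using the returned key. Below is a minimal sketch, assuming the server listens on vLLM's default port 8000 and using nid001234 as a stand-in for the compute node assigned to your job:

from openai import OpenAI

# Placeholder host: substitute the compute node running your vLLM server.
client = OpenAI(base_url="http://nid001234:8000/v1", api_key=api_key)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
)
print(response.choices[0].message.content)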

Features

  • Deploy LLMs on NERSC Slurm clusters with GPU allocation
  • Monitor Slurm jobs and deployed services
  • Embed Gradio UIs inline within Jupyter notebooks on NERSC JupyterHub
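
On the Gradio point, the package ships helpers for embedding UIs on NERSC JupyterHub (see the Docs page for the exact API). As a generic illustration only, a chat UI wired to the deployed endpoint could look like the sketch below; the host, port, and key are placeholders:

import gradio as gr
from openai import OpenAI

# Placeholders: substitute the compute node and the API key from your deployment.
client = OpenAI(base_url="http://nid001234:8000/v1", api_key="your_api_key")

def chat(message, history):
    # Forward the latest user message to the deployed model and return its reply.
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": message}],
    )
    return resp.choices[0].message.content

gr.ChatInterface(chat).launch()  # renders inline in a notebook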

Prerequisites

  • Active NERSC account with permissions to run jobs on Perlmutter
  • Optional: a Hugging Face access token set as the HF_TOKEN environment variable, required for certain gated models
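
For gated models (for example, the Llama family), set HF_TOKEN before deploying. A minimal sketch that fails fast if the token is missing, assuming the deployment inherits your shell environment:

import os

# HF_TOKEN should be exported in your shell (e.g., `export HF_TOKEN=...`),
# not hard-coded; this check only verifies that it is present.
if "HF_TOKEN" not in os.environ:
    raise RuntimeError("Set HF_TOKEN before deploying gated models.")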

Usage

For detailed usage instructions, including CLI options and Python library examples, please refer to the Docs page.

Security Best Practices

  • Protect API Keys and Tokens: Always store API keys and tokens securely, preferably in environment variables; avoid hard-coding them in code or sharing them publicly (see the sketch after this list).
  • Use Trusted Models: Only download and deploy models from reputable and verified sources, such as official Hugging Face repositories or other trusted providers. Verify the integrity and authenticity of the models to avoid potential security risks, such as malicious code or data leaks.
  • Respect Data Privacy: Avoid using sensitive or personal data unless absolutely necessary, and ensure compliance with data privacy regulations.
  • Follow Licensing and IP Rights: Comply with all model licenses and institutional policies when deploying and using models.
  • Restrict Access: Limit access to deployed services by enforcing API key authentication and network-level restrictions where possible.
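
To make the first point concrete, read the key from the environment rather than pasting it into a notebook. The variable name NERSC_CHAT_API_KEY below is illustrative, not part of this package:

import os
from openai import OpenAI

# Illustrative variable name; export it yourself after deployment.
api_key = os.environ["NERSC_CHAT_API_KEY"]
client = OpenAI(base_url="http://nid001234:8000/v1", api_key=api_key)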

Troubleshooting

  • Ensure your NERSC account has proper permissions.
  • Verify Slurm queue and constraints match your allocation.
  • Check logs for errors; adjust --log-level for more verbosity.
  • Confirm network access for Gradio proxy URLs on JupyterHub.
