New documentation site for Seldon Core v2 #5760

Merged
merged 33 commits into from
Sep 9, 2024
Commits
f39f911
moved docs out of the source directory and removed spnix-related files
ramonpzg Jul 16, 2024
669bbf1
APIs section completed
ramonpzg Jul 16, 2024
05321e8
changing the configuration section in the getting started guide
ramonpzg Jul 18, 2024
4c0b628
getting started sectionc completed
ramonpzg Aug 8, 2024
5e3f404
rearranged models directory and enhanced different docs
ramonpzg Aug 13, 2024
d818bf9
added most images in the docs to the images directory
ramonpzg Aug 13, 2024
7bfe2e6
moved outliers and drift docs to its own file in the root directory
ramonpzg Aug 13, 2024
a837d33
deleted servers directory and moved servers.md to the root directory …
ramonpzg Aug 13, 2024
f9c6a95
deleted pipelines dir and moved pipelines.md to the root directory
ramonpzg Aug 13, 2024
acc7d7c
deleted inference dir and moved inference.md to the root directory
ramonpzg Aug 13, 2024
f502be2
deleted explainers dir and moved explainers.md to the root directory
ramonpzg Aug 13, 2024
cfdae58
deleted performance-tests dir and moved .md to the root directory
ramonpzg Aug 13, 2024
55620fc
deleted experiments dir and moved .md to the root directory
ramonpzg Aug 13, 2024
1fbd366
updated about section to match gitbook's expected format
ramonpzg Aug 13, 2024
6387791
updated FAQs section to match gitbook's expected format
ramonpzg Aug 13, 2024
ae1d6e3
updated pandas query section with choice1.yaml
ramonpzg Aug 13, 2024
c57384a
mostly moved and renamed files and directories
ramonpzg Aug 13, 2024
5e3cbbf
updated SUMMARY.md for GitBook
ramonpzg Aug 13, 2024
49a44cf
adding additional images
ramonpzg Aug 21, 2024
a7e3590
restructured development dir
ramonpzg Aug 21, 2024
25fcba5
restructured and reformatted examples dir to match GitBook's md flavor
ramonpzg Aug 21, 2024
d1cae55
added gitbook format to metrics dir
ramonpzg Aug 21, 2024
a9c3227
restructured k8s directory to match GitBook's expected md flavor
ramonpzg Aug 21, 2024
c33cd72
reformatted cli dir
ramonpzg Aug 21, 2024
caeed09
typos and links fixed
ramonpzg Aug 21, 2024
86773df
typos and links fixed
ramonpzg Aug 21, 2024
8769403
tentative structured added to the root of the docs
ramonpzg Aug 21, 2024
cad0b7e
fixed names in kubernetes section
ramonpzg Aug 24, 2024
7e34f9d
GITBOOK-1: changed hard-coded reference to scheduler.proto
Aug 27, 2024
48707d1
added reference to agent.proto instead of hard-coded version
ramonpzg Aug 27, 2024
05bb232
added reference to chainer.proto instead of hard-coded version
ramonpzg Aug 27, 2024
e4a43d9
removed hard-coded references and added GitHub Gist pointing to v2 br…
ramonpzg Aug 27, 2024
3903c25
fixed format and broken links
ramonpzg Sep 5, 2024
26 changes: 0 additions & 26 deletions docs/Makefile

This file was deleted.

63 changes: 63 additions & 0 deletions docs/README.md
@@ -0,0 +1,63 @@
# About

Seldon V2 APIs provide a state-of-the-art solution for machine learning inference that
can run locally on a laptop as well as on Kubernetes for production.

{% embed url="https://www.youtube.com/watch?v=ar5lSG_idh4" %}

## Features

* A single platform for inference of a wide range of standard and custom artifacts.
* Deploy locally in Docker during development and testing of models.
* Deploy at scale on Kubernetes for production.
* Deploy anything from single models to multi-step pipelines.
* Save infrastructure costs by deploying multiple models transparently in inference servers.
* Overcommit on resources to deploy more models than available memory.
* Dynamically extend models with pipelines, using a data-centric perspective backed by Kafka.
* Explain individual models and pipelines with state-of-the-art explanation techniques.
* Deploy drift and outlier detectors alongside models.
* Kubernetes service mesh agnostic: use the service mesh of your choice.


## Core features and comparison to Seldon Core V1 APIs

Our V2 APIs split core tasks into separate resources, allowing users to get started quickly
by deploying a Model and then progressing to more complex Pipelines, Explanations and Experiments.

![intro](images/intro.png)

## Multi-model serving

Seldon will transparently provision your model onto the correct inference server.

![mms1](images/multimodel1.png)

By packing multiple models onto a smaller set of servers, users can save infrastructure costs and
utilize their servers efficiently.

![mms2](images/mms.png)
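
To make the packing idea concrete, here is a minimal sketch of placing several models onto a small set of servers. It is illustrative only and is not Seldon's scheduler: the `Server` class, the `pack` function, the first-fit-decreasing strategy, and the memory figures are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical inference server with a fixed memory budget (in MB)."""
    name: str
    memory_mb: int
    used_mb: int = 0
    models: list = field(default_factory=list)

    def fits(self, size_mb: int) -> bool:
        return self.used_mb + size_mb <= self.memory_mb


def pack(models, servers):
    """Place (name, size_mb) pairs using first-fit decreasing.

    Returns the names of models that did not fit on any server.
    """
    unplaced = []
    # Largest models first, so big models are not blocked by small ones.
    for name, size_mb in sorted(models, key=lambda m: m[1], reverse=True):
        for server in servers:
            if server.fits(size_mb):
                server.models.append(name)
                server.used_mb += size_mb
                break
        else:
            # No server had enough free memory for this model.
            unplaced.append(name)
    return unplaced


servers = [Server("server-a", 1000), Server("server-b", 1000)]
models = [("m1", 600), ("m2", 500), ("m3", 400), ("m4", 300)]
unplaced = pack(models, servers)
```

In this sketch four models share two servers instead of needing one server each, which is the cost saving the paragraph above describes.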

By allowing over-commit, users can provision more models than available memory resources allow,
because Seldon transparently unloads models that are not in use.

![mms3](images/overcommit.png)
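
The over-commit behaviour can be sketched as a small cache that unloads idle models to make room for the one being requested. This is an illustration under stated assumptions, not Seldon's implementation: the `OvercommitServer` class and its methods are hypothetical, and the least-recently-used eviction policy is an assumption, since the text above only says that unused models are unloaded.

```python
from collections import OrderedDict


class OvercommitServer:
    """Hypothetical server that registers more models than fit in memory."""

    def __init__(self, memory_mb: int):
        self.memory_mb = memory_mb
        self.registered = {}         # model name -> size in MB
        self.loaded = OrderedDict()  # loaded models, least recently used first

    def register(self, name: str, size_mb: int) -> None:
        if size_mb > self.memory_mb:
            raise ValueError(f"{name} can never fit on this server")
        self.registered[name] = size_mb

    def infer(self, name: str) -> str:
        """Ensure the model is in memory; report whether it had to be loaded."""
        size_mb = self.registered[name]
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
            return "hit"
        # Unload least recently used models until the new one fits (assumed policy).
        while sum(self.loaded.values()) + size_mb > self.memory_mb:
            self.loaded.popitem(last=False)
        self.loaded[name] = size_mb
        return "loaded"


server = OvercommitServer(memory_mb=1000)
for model_name, model_size in [("a", 600), ("b", 600), ("c", 300)]:
    server.register(model_name, model_size)  # 1500 MB registered on a 1000 MB server

results = [server.infer(n) for n in ["a", "c", "b", "c", "a"]]
```

Here 1500 MB of models are registered against 1000 MB of memory, and each request succeeds because an idle model is unloaded first.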

## Inference Servers

Seldon V2 supports any V2 protocol inference server. At present we include Seldon's MLServer and NVIDIA's Triton inference server automatically on install. These servers cover a wide range of artifacts, including custom Python models.

![servers](images/servers.png)
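
As a rough illustration of what "V2 protocol" means for a client, the sketch below builds the JSON body of a V2 inference request and the REST path it would be posted to. The helper names `build_v2_request` and `infer_path`, the model name `iris`, and the input name `predict` are assumptions for this example; the host and port depend on your installation and are not shown.

```python
import json
import math


def build_v2_request(input_name, data, shape, datatype="FP32"):
    """Build a V2 inference request body holding one flat input tensor."""
    if math.prod(shape) != len(data):
        raise ValueError("shape does not match the number of data elements")
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),
            }
        ]
    }


def infer_path(model_name):
    """REST path of the V2 inference endpoint for a model."""
    return f"/v2/models/{model_name}/infer"


# Hypothetical model and input names, used only for illustration.
body = build_v2_request("predict", [5.1, 3.5, 1.4, 0.2], shape=[1, 4])
path = infer_path("iris")
payload = json.dumps(body)
```

Because MLServer and Triton both accept this request shape, the same client code can target either server.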

## Service Mesh Agnostic

Seldon Core v2 can be integrated with any Kubernetes service mesh. There are current examples with Istio, Ambassador and Traefik.

![mesh](images/mesh.png)

## Publication

These features are influenced by our position paper on the next generation of ML model serving frameworks:

*Title*: [Desiderata for next generation of ML model serving](http://arxiv.org/abs/2210.14665)

*Workshop*: Challenges in deploying and monitoring ML systems workshop - NeurIPS 2022
124 changes: 124 additions & 0 deletions docs/SUMMARY.md
@@ -0,0 +1,124 @@
# Table of contents

* [Home](README.md)
* [Getting Started](getting-started/README.md)
* [Docker Installation](getting-started/docker-installation.md)
* [Kubernetes Installation](getting-started/kubernetes-installation/README.md)
* [Ansible](getting-started/kubernetes-installation/ansible.md)
* [Helm](getting-started/kubernetes-installation/helm.md)
* [Security](getting-started/kubernetes-installation/security/README.md)
* [AWS MSK mTLS](getting-started/kubernetes-installation/security/aws-msk-mtls.md)
* [AWS MSK SASL](getting-started/kubernetes-installation/security/aws-msk-sasl.md)
* [Azure Event Hub SASL Example](getting-started/kubernetes-installation/security/azure-event-hub-sasl.md)
* [Confluent Cloud Oauth 2.0 Example](getting-started/kubernetes-installation/security/confluent-oauth.md)
* [Confluent Cloud SASL Example](getting-started/kubernetes-installation/security/confluent-sasl.md)
* [Strimzi mTLS Example](getting-started/kubernetes-installation/security/strimzi-mtls.md)
* [Strimzi SASL Example](getting-started/kubernetes-installation/security/strimzi-sasl.md)
* [Reference](getting-started/kubernetes-installation/security/reference.md)
* [Configuration](getting-started/configuration.md)
* [Seldon CLI](getting-started/cli.md)
* [APIs](apis/README.md)
* [Internal](apis/internal/README.md)
* [Chainer](apis/internal/chainer.md)
* [Agent](apis/internal/agent.md)
* [Inference](apis/inference/README.md)
* [V2](apis/inference/v2.md)
* [Scheduler](apis/scheduler.md)
* [Architecture](architecture/README.md)
* [DataFlow](architecture/dataflow.md)
* [Examples](examples/README.md)
* [Local examples](examples/local-examples.md)
* [Kubernetes examples](examples/k8s-examples.md)
* [Huggingface models](examples/huggingface.md)
* [Model zoo](examples/model-zoo.md)
* [Artifact versions](examples/multi-version.md)
* [Pipeline examples](examples/pipeline-examples.md)
* [Pipeline to pipeline examples](examples/pipeline-to-pipeline.md)
* [Explainer examples](examples/explainer-examples.md)
* [Custom Servers](examples/custom-servers.md)
* [Local experiments](examples/local-experiments.md)
* [Experiment version examples](examples/experiment-versions.md)
* [Inference examples](examples/inference.md)
* [Tritonclient examples](examples/tritonclient-examples.md)
* [Batch Inference examples (kubernetes)](examples/batch-examples-k8s.md)
* [Batch Inference examples (local)](examples/batch-examples-local.md)
* [Checking Pipeline readiness](examples/pipeline-ready-and-metadata.md)
* [Multi-Namespace Kubernetes](examples/k8s-clusterwide.md)
* [Huggingface speech to sentiment with explanations pipeline](examples/speech-to-sentiment.md)
* [Production image classifier with drift and outlier monitoring](examples/cifar10.md)
* [Production income classifier with drift, outlier and explanations](examples/income.md)
* [Conditional pipeline with pandas query model](examples/pandasquery.md)
* [Kubernetes Server with PVC](examples/k8s-pvc.md)
* [Local Overcommit](examples/k8s-pvc.md)
* [Kubernetes](kubernetes/README.md)
* [Scaling](kubernetes/scaling.md)
* [Autoscaling](kubernetes/autoscaling.md)
* [Tracing](kubernetes/tracing.md)
* [Storage Secrets](kubernetes/storage-secrets.md)
* [Kafka](kubernetes/kafka.md)
* [Metrics](kubernetes/metrics.md)
* [Resources](kubernetes/resources/README.md)
* [Model](kubernetes/resources/model.md)
* [Experiment](kubernetes/resources/experiment.md)
* [Pipeline](kubernetes/resources/pipeline.md)
* [Server](kubernetes/resources/server.md)
* [Server Config](kubernetes/resources/serverconfig.md)
* [Server Runtime](kubernetes/resources/seldonruntime.md)
* [Seldon Config](kubernetes/resources/seldonconfig.md)
* [Service Meshes](kubernetes/service-meshes/README.md)
* [Ambassador](kubernetes/service-meshes/ambassador.md)
* [Istio](kubernetes/service-meshes/istio.md)
* [Traefik](kubernetes/service-meshes/traefik.md)
* [Models](models/README.md)
* [Multi-Model Serving](models/mms.md)
* [Inference Artifacts](models/inference-artifacts.md)
* [rClone](models/rclone.md)
* [Parameterized Models](models/parameterized-models/README.md)
* [Pandas Query](models/parameterized-models/pandasquery.md)
* [Metrics](metrics/README.md)
* [Usage](metrics/usage.md)
* [Operational](metrics/operational.md)
* [Local Metrics](metrics/local-metrics-test.md)
* [Development](development/README.md)
* [License](development/licenses.md)
* [Release](development/release.md)
* [CLI](cli/README.md)
* [Seldon](cli/seldon.md)
* [Config](cli/seldon\_config.md)
* [Config Activate](cli/seldon\_config\_activate.md)
* [Config Deactivate](cli/seldon\_config\_deactivate.md)
* [Config Add](cli/seldon\_config\_add.md)
* [Config List](cli/seldon\_config\_list.md)
* [Config Remove](cli/seldon\_config\_remove.md)
* [Experiment](cli/seldon\_experiment.md)
* [Experiment Start](cli/seldon\_experiment\_start.md)
* [Experiment Status](cli/seldon\_experiment\_status.md)
* [Experiment List](cli/seldon\_experiment\_list.md)
* [Experiment Stop](cli/seldon\_experiment\_stop.md)
* [Model](cli/seldon\_model.md)
* [Model Status](cli/seldon\_model\_status.md)
* [Model Load](cli/seldon\_model\_load.md)
* [Model List](cli/seldon\_model\_list.md)
* [Model Infer](cli/seldon\_model\_infer.md)
* [Model Metadata](cli/seldon\_model\_metadata.md)
* [Model Unload](cli/seldon\_model\_unload.md)
* [Pipeline](cli/seldon\_pipeline.md)
* [Pipeline Load](cli/seldon\_pipeline\_load.md)
* [Pipeline Status](cli/seldon\_pipeline\_status.md)
* [Pipeline List](cli/seldon\_pipeline\_list.md)
* [Pipeline Inspect](cli/seldon\_pipeline\_inspect.md)
* [Pipeline Infer](cli/seldon\_pipeline\_infer.md)
* [Pipeline Unload](cli/seldon\_pipeline\_unload.md)
* [Server](cli/seldon\_server.md)
* [Server List](cli/seldon\_server\_list.md)
* [Server Status](cli/seldon\_server\_status.md)
* [Pipelines](pipelines.md)
* [Experiments](experiments.md)
* [Servers](servers.md)
* [Inference](inference.md)
* [Outlier Detection](outlier.md)
* [Drift Detection](drift.md)
* [Explainers](explainers.md)
* [Performance Tests](performance-tests.md)
* [Upgrading](upgrading.md)
* [FAQ](faqs.md)
7 changes: 7 additions & 0 deletions docs/apis/README.md
@@ -0,0 +1,7 @@
# APIs

Seldon provides APIs for management and inference.

* [API for inference](./inference/README.md)
* [Scheduler API for management](./scheduler/README.md) (Advanced)
* [Internal APIs](./internal/README.md) (Reference)
@@ -2,17 +2,6 @@

Seldon inference servers must respect the following API specification.

* [Seldon, KServe, NVIDIA V2 Inference API Spec](./v2.md)

In future, Seldon may provide extensions for use with Pipelines, Experiments and Explainers.

```{toctree}
:maxdepth: 1
:hidden:

v2.md
```



