@@ -0,0 +1,26 @@
{
  "inputs": [
    {
      "name": "predict-prob",
      "shape": [
        1,
        10
      ],
      "datatype": "FP32",
      "data": [
        [
          61,
          56,
          27,
          59,
          15,
          50,
          60,
          77,
          62,
          98
        ]
      ]
    }
  ]
}
@@ -0,0 +1,180 @@
---
title: CatBoost
description: Deploy CatBoost models with KServe
---

# Deploying CatBoost Models with KServe

This guide demonstrates how to deploy CatBoost models using KServe's `InferenceService` and how to send inference requests using the [Open Inference Protocol](https://github.com/kserve/open-inference-protocol).

## Prerequisites

Before you begin, make sure you have:

- A Kubernetes cluster with [KServe installed](../../../../getting-started/quickstart-guide.md).
- `kubectl` CLI configured to communicate with your cluster.
- Basic knowledge of Kubernetes concepts and CatBoost models.
- Access to cloud storage (like Google Cloud Storage) to store your model artifacts.

## Training a Sample CatBoost Model

The first step is to train a sample CatBoost model and serialize it in an appropriate format using the [save_model](https://catboost.ai/docs/en/concepts/python-reference_catboost_save_model) API.

```python
import numpy as np
from catboost import CatBoostClassifier

# Generate a random toy dataset: 100 samples, 10 integer features, binary labels
train_data = np.random.randint(0, 100, size=(100, 10))
train_labels = np.random.randint(0, 2, size=(100))

# Train a small binary classifier
model = CatBoostClassifier(
    iterations=2,
    depth=2,
    learning_rate=1,
    loss_function="Logloss",
    verbose=True,
)
model.fit(train_data, train_labels)

# Serialize the model in CatBoost's native binary format
model.save_model("model.cbm")
```

## Testing the Model Locally

Once you have your model serialized as `model.cbm`, you can use [MLServer](https://github.com/SeldonIO/MLServer) to create a local model server. For more details, check the [CatBoost example documentation](https://mlserver.readthedocs.io/en/stable/examples/catboost/README.html).

:::tip
This local testing step is optional. You can skip to the deployment section if you prefer.
:::

### Prerequisites

To use MLServer locally, install the `mlserver` package and the CatBoost runtime:

```bash
pip install mlserver mlserver-catboost
```

### Model Settings

Next, provide model settings so that MLServer knows:

- The inference runtime to serve your model (i.e. `mlserver_catboost.CatboostModel`)
- The model's parameters

These can be specified through environment variables or by creating a local `model-settings.json` file:

```json
{
  "implementation": "mlserver_catboost.CatboostModel",
  "name": "catboost-classifier",
  "parameters": {
    "uri": "./model.cbm",
    "version": "v0.1.0"
  }
}
```

### Starting the Model Server Locally

With the `mlserver` package installed, write the `model-settings.json` file and start your server:

```bash
cat << EOF > model-settings.json
{
  "implementation": "mlserver_catboost.CatboostModel",
  "name": "catboost-classifier",
  "parameters": {
    "uri": "./model.cbm",
    "version": "v0.1.0"
  }
}
EOF

mlserver start .
```

If everything is in order, you should see messages like the following in the MLServer process output:

```txt
2025-09-18 01:15:05,483 [mlserver.parallel] DEBUG - Starting response processing loop...
2025-09-18 01:15:05,484 [mlserver.rest] INFO - HTTP server running on http://0.0.0.0:8080
...
2025-09-18 01:15:06,528 [mlserver][catboost-classifier:v0.1.0] INFO - Loaded model 'catboost-classifier' successfully.
2025-09-18 01:15:06,529 [mlserver][catboost-classifier:v0.1.0] INFO - Loaded model 'catboost-classifier' successfully.
```
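
With the server up, you can optionally send a test request to the local v2 inference endpoint. The sketch below assumes MLServer's default HTTP port (`8080`) and the model name from `model-settings.json` above; the payload mirrors the sample input used later in this guide.

```bash
# Send an Open Inference Protocol (v2) request to the local MLServer instance
curl -s \
  -H "Content-Type: application/json" \
  -d '{
        "inputs": [
          {
            "name": "predict-prob",
            "shape": [1, 10],
            "datatype": "FP32",
            "data": [[61, 56, 27, 59, 15, 50, 60, 77, 62, 98]]
          }
        ]
      }' \
  http://localhost:8080/v2/models/catboost-classifier/infer
```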

## Deploying the Model with InferenceService

When deploying the model with InferenceService, KServe injects sensible defaults that work out-of-the-box without additional configuration. However, you can override these defaults by providing a `model-settings.json` file similar to your local one. You can even provide [multiple `model-settings.json` files to load multiple models](https://github.com/SeldonIO/MLServer/tree/master/docs/examples/mms).
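
In the multi-model case, MLServer discovers each model through its own `model-settings.json`. One possible repository layout (the folder names here are only illustrative) is:

```txt
model-repository/
├── classifier-a/
│   ├── model-settings.json
│   └── model.cbm
└── classifier-b/
    ├── model-settings.json
    └── model.cbm
```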

To use the Open Inference Protocol (v2) for inference with the deployed model, set the `protocolVersion` field to `v2`. In this example, your model artifacts have already been uploaded to a Google Cloud Storage bucket and can be accessed at `gs://kfserving-examples/models/catboost/classifier`.

Apply the YAML manifest:

```bash
kubectl apply -f - << EOF
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "catboost-example"
spec:
  predictor:
    model:
      runtime: kserve-mlserver
      modelFormat:
        name: catboost
      protocolVersion: v2
      storageUri: "gs://kfserving-examples/models/catboost/classifier"
      resources:
        requests:
          cpu: "1"
          memory: "1Gi"
        limits:
          cpu: "1"
          memory: "1Gi"
EOF
```

> **Author:** @spolti, would it be possible to upload this file to the official KServe bucket? It's just a single `.cbm` file.
>
> **Contributor:** That's a good question, @yuzisun wdyt?
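
Before sending traffic, wait until the `InferenceService` reports ready. One quick way to check (assuming the default namespace) is:

```bash
# Block until the InferenceService is ready, then print its status and URL
kubectl wait --for=condition=Ready inferenceservice/catboost-example --timeout=300s
kubectl get inferenceservice catboost-example
```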

## Testing the Deployed Model

You can test your deployed model by sending a sample request following the [Open Inference Protocol](https://github.com/kserve/open-inference-protocol).

Use our sample input file [catboost-input.json](./catboost-input.json) to test the model.

[Determine the ingress IP and ports](../../../../getting-started/predictive-first-isvc.md#4-determine-the-ingress-ip-and-ports), then set the `INGRESS_HOST` and `INGRESS_PORT` environment variables.
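
For example, with an Istio ingress gateway exposed through a LoadBalancer service (a common setup; adjust the namespace and service name to your environment):

```bash
# Resolve the ingress gateway's external IP and HTTP port
INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway \
  -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
```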

```bash
SERVICE_HOSTNAME=$(kubectl get inferenceservice catboost-example -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -H "Content-Type: application/json" \
  -d @./catboost-input.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/catboost-example/infer
```

:::tip[Expected Output]
```json
{
  "model_name": "catboost-example",
  "id": "70062817-f7de-4105-93ef-e0e2ea3d214e",
  "parameters": {},
  "outputs": [
    {
      "name": "predict",
      "shape": [1, 1],
      "datatype": "INT64",
      "parameters": {
        "content_type": "np"
      },
      "data": [0]
    }
  ]
}
```
:::
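
The `data` field contains the predicted class label, which for the binary classifier trained above is either `0` or `1`.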
sidebars.ts: 1 change (1 addition & 0 deletions)

@@ -125,6 +125,7 @@ const sidebars: SidebarsConfig = {
"model-serving/predictive-inference/frameworks/lightgbm/lightgbm",
"model-serving/predictive-inference/frameworks/paddle/paddle",
"model-serving/predictive-inference/frameworks/mlflow/mlflow",
"model-serving/predictive-inference/frameworks/catboost/catboost",
"model-serving/predictive-inference/frameworks/onnx/onnx",
{
type: 'category',