tls: failed to verify certificate: x509: certificate signed by unknown authority" #1095

Open
ShrishtiKarkera opened this issue Sep 30, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@ShrishtiKarkera

Bug Description

I'm unable to create a KServe InferenceService from a JupyterLab notebook. When I create the InferenceService through the KServe client, it throws this error:
"inferenceservice.kserve-webhook-server.defaulter": failed to call webhook: Post "https://kserve-webhook-server-service.kubeflow.svc:443/mutate-serving-kserve-io-v1beta1-inferenceservice?timeout=10s": tls: failed to verify certificate: x509: certificate signed by unknown authority

The InferenceService client code looks like this; my model is stored in MinIO:

from kserve import KServeClient, constants
from kserve.models import (
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec
)
from kubernetes import client

# Target namespace for the InferenceService
namespace = "admin"

name = 'iris-classifier'
kserve_version = 'v1beta1'
api_version = constants.KSERVE_GROUP + '/' + kserve_version

# Create the InferenceService
isvc = V1beta1InferenceService(
    api_version=api_version,
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(
        name=name, 
        namespace=namespace, 
        annotations={'sidecar.istio.io/inject': 'false'}
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            service_account_name="sa-minio-kserve",
            sklearn=V1beta1SKLearnSpec(
                storage_uri="s3://mlpipeline/models/iris_model.pkl"
            )
        )
    )
)

# Submit the InferenceService to the cluster (this is the call that fails)
kserve_client = KServeClient()
kserve_client.create(isvc)

I checked the certificates and everything appears to be in place. I also tried restarting the MutatingWebhookConfiguration, but that didn't help.
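One way to make the certificate check concrete is to look at the `caBundle` the API server uses to verify the webhook. The sketch below is only an illustration, not the maintainers' procedure: it reads a MutatingWebhookConfiguration as JSON via `kubectl` and prints a SHA-256 fingerprint of the first certificate in each webhook's `clientConfig.caBundle`. The function names are hypothetical, and the name of the configuration object in a Charmed Kubeflow deployment is an assumption you would need to look up first (for example with `kubectl get mutatingwebhookconfigurations`).

```python
import base64
import hashlib
import json
import ssl
import subprocess

PEM_FOOTER = "-----END CERTIFICATE-----"


def pem_fingerprint(pem: str) -> str:
    """Return the SHA-256 hex digest of the first certificate in a PEM string."""
    # A caBundle may hold several certificates; only the first one is inspected.
    first = pem.strip().split(PEM_FOOTER)[0] + PEM_FOOTER
    der = ssl.PEM_cert_to_DER_cert(first)
    return hashlib.sha256(der).hexdigest()


def ca_bundle_fingerprints(mwc: dict) -> dict:
    """Map each webhook name to its caBundle fingerprint (None if caBundle is empty)."""
    result = {}
    for hook in mwc.get("webhooks", []):
        bundle_b64 = hook.get("clientConfig", {}).get("caBundle", "")
        if not bundle_b64:
            result[hook["name"]] = None
            continue
        pem = base64.b64decode(bundle_b64).decode("ascii")
        result[hook["name"]] = pem_fingerprint(pem)
    return result


def fetch_webhook_config(name: str) -> dict:
    """Fetch a MutatingWebhookConfiguration as a dict (requires kubectl access).

    `name` is whatever the object is called in your cluster; it is not known
    from this issue.
    """
    out = subprocess.run(
        ["kubectl", "get", "mutatingwebhookconfiguration", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)
```

Calling `ca_bundle_fingerprints(fetch_webhook_config("<your-config-name>"))` would then show whether the `caBundle` is empty and give a fingerprint that can be compared by hand against the CA certificate that signed the webhook server's serving certificate. A mismatch or an empty bundle would be consistent with the "certificate signed by unknown authority" error, though other causes are possible.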

To Reproduce

  1. Deploy Charmed Kubeflow - https://charmed-kubeflow.io/docs/get-started-with-charmed-kubeflow
  2. Allow minio access - https://charmed-kubeflow.io/docs/allow-access-minio
  3. Allow KServe to access MinIO
  4. Launch a new notebook (scipy image)
  5. Execute the following code
    Note: the model is in the MinIO bucket mlpipeline (upload the iris_model.pkl file)
# Run in a notebook cell
%pip install minio boto3 mlflow

import joblib
import pandas as pd
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the iris dataset into a DataFrame
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = iris.target
df = df.dropna()

# Split features and target
target_column = 'species'
X = df.drop(columns=[target_column])
y = df[target_column]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=47
)

# Train the model and save it locally
iris_model = LogisticRegression(max_iter=200)
iris_model.fit(X_train, y_train)
joblib.dump(iris_model, 'iris_model.pkl')

from kserve import KServeClient, constants
from kserve.models import (
    V1beta1InferenceService,
    V1beta1InferenceServiceSpec,
    V1beta1PredictorSpec,
    V1beta1SKLearnSpec
)
from kubernetes import client

# Target namespace for the InferenceService
namespace = "admin"

name = 'iris-classifier'
kserve_version = 'v1beta1'
api_version = constants.KSERVE_GROUP + '/' + kserve_version

# Create the InferenceService
isvc = V1beta1InferenceService(
    api_version=api_version,
    kind=constants.KSERVE_KIND,
    metadata=client.V1ObjectMeta(
        name=name, 
        namespace=namespace, 
        annotations={'sidecar.istio.io/inject': 'false'}
    ),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            service_account_name="sa-minio-kserve",
            sklearn=V1beta1SKLearnSpec(
                storage_uri="s3://mlpipeline/models/iris_model.pkl"
            )
        )
    )
)

# Submit the InferenceService to the cluster (this is the call that fails)
kserve_client = KServeClient()
kserve_client.create(isvc)
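
The code above saves `iris_model.pkl` locally but does not show the upload to the `mlpipeline` bucket mentioned in the note. A minimal sketch of that step, assuming the `minio` package installed earlier, could look like the following. The helper names, the endpoint, and the credentials are all hypothetical placeholders, not values taken from this report.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def upload_model(local_path, storage_uri, endpoint, access_key, secret_key, secure=False):
    """Upload a local model file to the bucket/key named by storage_uri.

    endpoint, access_key and secret_key are deployment-specific assumptions.
    """
    from minio import Minio  # imported lazily; installed by the pip step

    bucket, key = split_s3_uri(storage_uri)
    client = Minio(
        endpoint=endpoint,
        access_key=access_key,
        secret_key=secret_key,
        secure=secure,
    )
    # Create the bucket only if it does not exist yet
    if not client.bucket_exists(bucket_name=bucket):
        client.make_bucket(bucket_name=bucket)
    client.fput_object(bucket_name=bucket, object_name=key, file_path=local_path)
    return bucket, key
```

For the storage URI used in this issue, the call would be along the lines of `upload_model("iris_model.pkl", "s3://mlpipeline/models/iris_model.pkl", endpoint="<minio-host>:<port>", access_key="<key>", secret_key="<secret>")`.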

Environment

AWS t3.2xlarge instance with 10 GB of storage
Installed Charmed Kubeflow, MinIO and MLflow
Allowed MinIO access and MLflow access

Relevant Log Output

Post \"https://kserve-webhook-server-service.kubeflow.svc:443/mutate-serving-kserve-io-v1beta1-inferenceservice?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
{"level":"error","ts":"2024-09-30T20:50:29Z","msg":"Reconciler error","controller":"inferenceservice","controllerGroup":"serving.kserve.io","controllerKind":"InferenceService","InferenceService":{"name":"iris-classifier","namespace":"admin"},"namespace":"admin","name":"iris-classifier","reconcileID":"57908d65-5f49-4305-ade5-3247160b89ec","error":"Internal error occurred: failed calling webhook \"inferenceservice.kserve-webhook-server.defaulter\": failed to call webhook: Post \"https://kserve-webhook-server-service.kubeflow.svc:443/mutate-serving-kserve-io-v1beta1-inferenceservice?timeout=10s\": tls: failed to verify certificate: x509: certificate signed by unknown authority","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}

Additional Context

No response

ShrishtiKarkera added the bug label Sep 30, 2024

Thank you for reporting your feedback to us!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-6340.

This message was autogenerated

@NohaIhab
Contributor

NohaIhab commented Oct 9, 2024

Hi @ShrishtiKarkera
From the logs, it looks like there's an issue verifying the certificate of the KServe webhook referenced in the MutatingWebhookConfiguration object.
To debug this further, we first need to check the health of the admission webhook charm and workload.
Can you share:

  1. The logs of the admission webhook charm by running:
juju debug-log --replay --include unit-admission-webhook-0
  2. The logs of the admission webhook workload by running:
kubectl logs -n kubeflow admission-webhook-0 -c admission-webhook
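
If it helps to gather both logs in one go, the two commands above can be wrapped in a small script. This is only a convenience sketch: the `collect_logs` helper is hypothetical, and `--no-tail` is added to the `juju debug-log` command (it is not part of the command as requested) so that the call returns instead of continuing to follow the log.

```python
import subprocess

# The two commands requested above; --no-tail is an addition so juju exits
# after printing the existing log messages.
LOG_COMMANDS = {
    "charm": [
        "juju", "debug-log", "--replay", "--no-tail",
        "--include", "unit-admission-webhook-0",
    ],
    "workload": [
        "kubectl", "logs", "-n", "kubeflow",
        "admission-webhook-0", "-c", "admission-webhook",
    ],
}


def collect_logs(commands=LOG_COMMANDS, runner=subprocess.run) -> dict:
    """Run each command and return {label: stdout}, or stderr if it failed."""
    logs = {}
    for label, cmd in commands.items():
        proc = runner(cmd, capture_output=True, text=True)
        logs[label] = proc.stdout if proc.returncode == 0 else proc.stderr
    return logs
```

The returned dictionary can be written to files and attached to the issue, for example with `open("charm.log", "w").write(collect_logs()["charm"])`.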
