Dragonfly is not working with Istio (injected sidecars) #350

Open
vkoshkarovroku opened this issue Jan 17, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@vkoshkarovroku

Bug report:

Dragonfly is not working with Istio (sidecar injection): the scheduler fails to start because it apparently cannot complete its gRPC calls to the manager, which blocks all the other components from running.

dragonfly-scheduler-0 logs:

scheduler 2025-01-16T20:44:49.384Z    INFO    cmd/root.go:123    version:
scheduler Major: 2, Minor: 0, GitVersion: v2.2.0, GitCommit: cc4abed, Platform: linux, BuildTime: 2024-12-31T04:21:16Z, GoVersion: go1.23.0 linux/arm64, Gotags: none, Gogcflags: none
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core]original dial target is: "dragonfly-manager.dragonfly-system.svc.cluster.local:65003"
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]Channel created
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]parsed dial target is: resolver.Target{URL:url.URL{Scheme:"passthrough", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/dragonfly-manager.dragonfly-system.svc.cluster.local:65003", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]Channel authority set to "dragonfly-manager.dragonfly-system.svc.cluster.local:65003"
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]Resolver state updated: {
scheduler   "Addresses": [
scheduler     {
scheduler       "Addr": "dragonfly-manager.dragonfly-system.svc.cluster.local:65003",
scheduler       "ServerName": "",
scheduler       "Attributes": null,
scheduler       "BalancerAttributes": null,
scheduler       "Metadata": null
scheduler     }
scheduler   ],
scheduler   "Endpoints": [
scheduler     {
scheduler       "Addresses": [
scheduler         {
scheduler           "Addr": "dragonfly-manager.dragonfly-system.svc.cluster.local:65003",
scheduler           "ServerName": "",
scheduler           "Attributes": null,
scheduler           "BalancerAttributes": null,
scheduler           "Metadata": null
scheduler         }
scheduler       ],
scheduler       "Attributes": null
scheduler     }
scheduler   ],
scheduler   "ServiceConfig": null,
scheduler   "Attributes": null
scheduler } (resolver returned new addresses)
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]Channel switches to new LB policy "pick_first"
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1 SubChannel #2]Subchannel created
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]Channel Connectivity change to CONNECTING
scheduler 2025-01-16T20:44:49.384Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]Channel exiting idle mode
scheduler 2025-01-16T20:44:49.385Z    INFO    grpclog/grpclog.go:141    [core][Channel #1 SubChannel #2]Subchannel Connectivity change to CONNECTING
scheduler 2025-01-16T20:44:49.385Z    INFO    grpclog/grpclog.go:141    [core][Channel #1 SubChannel #2]Subchannel picks a new address "dragonfly-manager.dragonfly-system.svc.cluster.local:65003" to connect
scheduler 2025-01-16T20:44:49.387Z    INFO    grpclog/grpclog.go:141    [core][Channel #1 SubChannel #2]Subchannel Connectivity change to READY
scheduler 2025-01-16T20:44:49.387Z    INFO    grpclog/grpclog.go:141    [core][Channel #1]Channel Connectivity change to READY
scheduler 2025-01-16T20:44:50.550Z    WARN    zap/client_interceptors.go:52    finished client unary call    {"system": "grpc", "span.kind": "client", "grpc.service": "manager.v2.Manager", "grpc.method": "UpdateScheduler", "error": "rpc error: code = Unavailable desc = upstream connect error or disconnect/reset before headers. reset reason: protocol error", "grpc.code": "Unavailable", "grpc.time_ms": 1161.175}
scheduler github.com/grpc-ecosystem/go-grpc-middleware/logging/zap.logFinalClientLine
scheduler     /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@<version>/logging/zap/client_interceptors.go:52
scheduler github.com/grpc-ecosystem/go-grpc-middleware/logging/zap.UnaryClientInterceptor.func1
scheduler     /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@<version>/logging/zap/client_interceptors.go:30
scheduler d7y.io/dragonfly/v2/pkg/rpc/manager/client.GetV2ByAddr.ChainUnaryClient.func7.1
scheduler     /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@<version>/chain.go:113
scheduler github.com/grpc-ecosystem/go-grpc-prometheus.init.(*ClientMetrics).UnaryClientInterceptor.func1
scheduler     /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@<version>/client_metrics.go:112
scheduler d7y.io/dragonfly/v2/pkg/rpc/manager/client.GetV2ByAddr.ChainUnaryClient.func7
scheduler     /go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@<version>/chain.go:116
scheduler google.golang.org/grpc.(*ClientConn).Invoke
scheduler     /go/pkg/mod/google.golang.org/grpc@<version>/call.go:35
scheduler d7y.io/api/v2/pkg/apis/manager/v2.(*managerClient).UpdateScheduler
scheduler     /go/pkg/mod/d7y.io/api/v2@<version>/pkg/apis/manager/v2/manager_grpc.pb.go:101
scheduler d7y.io/dragonfly/v2/pkg/rpc/manager/client.(*v2).UpdateScheduler
scheduler     /go/src/d7y.io/dragonfly/v2/pkg/rpc/manager/client/client_v2.go:164
scheduler d7y.io/dragonfly/v2/scheduler/announcer.New
scheduler     /go/src/d7y.io/dragonfly/v2/scheduler/announcer/announcer.go:63
scheduler d7y.io/dragonfly/v2/scheduler.New
scheduler     /go/src/d7y.io/dragonfly/v2/scheduler/scheduler.go:136
scheduler d7y.io/dragonfly/v2/cmd/scheduler/cmd.runScheduler
scheduler     /go/src/d7y.io/dragonfly/v2/cmd/scheduler/cmd/root.go:128
scheduler d7y.io/dragonfly/v2/cmd/scheduler/cmd.init.func1
scheduler     /go/src/d7y.io/dragonfly/v2/cmd/scheduler/cmd/root.go:80
scheduler github.com/spf13/cobra.(*Command).execute
scheduler     /go/pkg/mod/github.com/spf13/cobra@<version>/command.go:985
scheduler github.com/spf13/cobra.(*Command).ExecuteC
scheduler     /go/pkg/mod/github.com/spf13/cobra@<version>/command.go:1117
scheduler github.com/spf13/cobra.(*Command).Execute
scheduler     /go/pkg/mod/github.com/spf13/cobra@<version>/command.go:1041
scheduler d7y.io/dragonfly/v2/cmd/scheduler/cmd.Execute
scheduler     /go/src/d7y.io/dragonfly/v2/cmd/scheduler/cmd/root.go:87
scheduler main.main
scheduler     /go/src/d7y.io/dragonfly/v2/cmd/scheduler/main.go:24
scheduler runtime.main
scheduler     /usr/local/go/src/runtime/proc.go:272
scheduler 2025-01-16T20:44:50.550Z    INFO    dependency/dependency.go:128    do 1 monitor finalizer
scheduler Error: rpc error: code = Unavailable desc = upstream connect error or disconnect/reset before headers. reset reason: protocol error
scheduler 2025-01-16T20:44:50.550Z    ERROR    cmd/root.go:88    rpc error: code = Unavailable desc = upstream connect error or disconnect/reset before headers. reset reason: protocol error
scheduler d7y.io/dragonfly/v2/cmd/scheduler/cmd.Execute
scheduler     /go/src/d7y.io/dragonfly/v2/cmd/scheduler/cmd/root.go:88
scheduler main.main
scheduler     /go/src/d7y.io/dragonfly/v2/cmd/scheduler/main.go:24
scheduler runtime.main
scheduler     /usr/local/go/src/runtime/proc.go:272
stream closed EOF for dragonfly-system/dragonfly-scheduler-0 (scheduler)

Expected behavior:

Dragonfly works with Istio-injected sidecars as expected and functions as a container registry.

How to reproduce it:

  1. Follow the instructions to set up a Kind cluster: https://d7y.io/docs/next/getting-started/installation/helm-charts/#setup-kubernetes-cluster. (The issue is reproducible on any Kubernetes cluster.)
  2. Set your context: CONTEXT=<your kubernetes context>
  3. Install Istio:
helm install istio-base istio/base -n istio-system --set defaultRevision=default --create-namespace --kube-context=$CONTEXT
helm install istiod istio/istiod -n istio-system \
  --set global.proxy.autoInject=enabled \
  --set global.defaultPodDisruptionBudget.enabled=true \
  --set meshConfig.defaultConfig.proxyMetadata.DNS_CAPTURE=true \
  --set meshConfig.accessLogFile="/dev/stdout" \
  --set meshConfig.enableAutoMtls=true
  4. Create the dragonfly-system namespace and label it with istio-injection=enabled to automatically inject Istio sidecars:
kubectl create namespace dragonfly-system --context=$CONTEXT
kubectl label namespace dragonfly-system istio-injection=enabled --context=$CONTEXT
  5. Install Dragonfly: https://d7y.io/docs/next/getting-started/installation/helm-charts/#create-dragonfly-cluster-based-on-helm-charts
    Alternatively, install Dragonfly with the following commands:
helm repo add dragonfly https://dragonflyoss.github.io/helm-charts/
helm upgrade --install --wait --namespace dragonfly-system dragonfly dragonfly/dragonfly --version 1.3.5 --kube-context $CONTEXT
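
To observe the failure after step 5 (helm upgrade --wait never completes because the scheduler does not become ready), check the pods and the scheduler container logs, for example:

kubectl get pods -n dragonfly-system --context=$CONTEXT
kubectl logs dragonfly-scheduler-0 -n dragonfly-system -c scheduler --context=$CONTEXT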

Environment:

  • Dragonfly version: 1.3.5
  • OS: Kind Kubernetes
  • Kernel (e.g. uname -a):
  • Others:
vkoshkarovroku added the bug (Something isn't working) label on Jan 17, 2025

vkoshkarovroku commented Jan 17, 2025

Istio selects the wrong protocol for Dragonfly's gRPC ports. We can fix that by adding appProtocol: grpc to the Dragonfly services. More info here: https://istio.io/latest/docs/ops/configuration/traffic-management/protocol-selection/
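
As an illustration, here is a minimal sketch of the fix applied to the manager Service, assuming the gRPC port is 65003 as seen in the scheduler logs; the actual port names, numbers, and selector labels in the Dragonfly Helm chart may differ:

apiVersion: v1
kind: Service
metadata:
  name: dragonfly-manager
  namespace: dragonfly-system
spec:
  selector:
    app.kubernetes.io/name: dragonfly-manager  # hypothetical selector; use the chart's actual labels
  ports:
    - name: grpc
      port: 65003          # manager gRPC port from the scheduler logs
      targetPort: 65003
      appProtocol: grpc    # explicit protocol selection: Istio treats this port as gRPC instead of sniffing it

Naming the port with a grpc- prefix would also work, since Istio falls back to the port-name convention when appProtocol is unset. An already-installed Service can be patched in place as well (assuming the gRPC port is the first entry in spec.ports):

kubectl patch service dragonfly-manager -n dragonfly-system --type=json \
  -p='[{"op": "add", "path": "/spec/ports/0/appProtocol", "value": "grpc"}]'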
