
[xDS] NACK/ACK should always be reported even if subscribedResourceTypeUrls cannot be found #11232

@lujiajing1126

What version of gRPC-Java are you using?

grpc-xds 1.56.1, but the recent version (1.64.0) also suffers from this issue.

What is your environment?

Linux/Kubernetes with Istio 1.16.7 installed.

What did you expect to see?

If a new CDS update is pushed to the client and all previous EDS subscriptions are revoked, the client removes the subscribedResourceTypeUrls entry for EDS entirely:

if (resourceSubscribers.get(type).isEmpty()) {
  resourceSubscribers.remove(type);
  subscribedResourceTypeUrls.remove(type.typeUrl());
}

As a result, the type lookup for the subsequent EDS response returns null, and the client sends neither an ACK nor a NACK to the xDS server:

if (type == null) {
  logger.log(
      XdsLogLevel.WARNING,
      "Ignore an unknown type of DiscoveryResponse: {0}",
      response.getTypeUrl());
  call.startRecvMessage();
  return;
}

When the client later subscribes to EDS again, as the new CDS requires, its request carries the old nonce from the last EDS response it processed. Istiod rejects that nonce as expired (see the log below).
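The sequence described above can be modeled with a small, self-contained sketch. It is not grpc-java code: the class `StaleNonceSketch`, its fields and its methods are hypothetical names made up for illustration, and the model assumes the client keeps one "last nonce" per resource type that is only updated when a response is processed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the client's bookkeeping; not grpc-java code.
class StaleNonceSketch {
  static final String EDS =
      "type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment";

  // Type URL -> names of subscribed resources of that type.
  final Map<String, Set<String>> subscribers = new HashMap<>();
  // Type URL -> nonce of the last response the client actually processed.
  final Map<String, String> lastNonce = new HashMap<>();

  void subscribe(String typeUrl, String name) {
    subscribers.computeIfAbsent(typeUrl, k -> new HashSet<>()).add(name);
  }

  void unsubscribe(String typeUrl, String name) {
    Set<String> names = subscribers.get(typeUrl);
    if (names == null) {
      return;
    }
    names.remove(name);
    if (names.isEmpty()) {
      // Mirrors the removal quoted above: the type becomes "unknown".
      subscribers.remove(typeUrl);
    }
  }

  // Returns true if the response was processed (and so would be ACKed or NACKed).
  boolean onResponse(String typeUrl, String nonce) {
    if (!subscribers.containsKey(typeUrl)) {
      return false; // Dropped: no ACK/NACK is sent and the nonce is not recorded.
    }
    lastNonce.put(typeUrl, nonce);
    return true;
  }

  // Nonce that the next DiscoveryRequest for this type would carry.
  String nonceForNextRequest(String typeUrl) {
    return lastNonce.getOrDefault(typeUrl, "");
  }

  public static void main(String[] args) {
    StaleNonceSketch client = new StaleNonceSketch();
    client.subscribe(EDS, "cluster-a");
    client.onResponse(EDS, "nonce-1");
    client.unsubscribe(EDS, "cluster-a");                  // new CDS drops every EDS resource
    boolean processed = client.onResponse(EDS, "nonce-2"); // server pushes EDS anyway
    client.subscribe(EDS, "cluster-b");                    // client resubscribes
    System.out.println(processed + " " + client.nonceForNextRequest(EDS));
    // prints "false nonce-1"
  }
}
```

In this model the second EDS response is dropped, so the resubscription request still carries `nonce-1` although the server last sent `nonce-2`.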

A possible workaround is to comment out the removal of subscribedResourceTypeUrls. The official xDS documentation states:

  • The xDS client should ACK or NACK every DiscoveryResponse received from the management server. The response_nonce field tells the server which of its responses the ACK or NACK is associated with.
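One way to satisfy this requirement, in the spirit of the workaround, is to keep the set of known type URLs separate from the current subscriptions and never shrink it. The sketch below only illustrates that idea; `AlwaysAckSketch` and all of its members are hypothetical names, and this is not necessarily how the fix would look in grpc-java.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "always ACK a known type"; not grpc-java code.
class AlwaysAckSketch {
  // Type URLs the client has ever subscribed to. Entries are never removed.
  final Set<String> knownTypeUrls = new HashSet<>();
  // Type URL -> names of currently subscribed resources.
  final Map<String, Set<String>> subscribers = new HashMap<>();
  // Type URL -> nonce of the last response received for that type.
  final Map<String, String> lastNonce = new HashMap<>();
  // Record of ACKs "sent", as "typeUrl@nonce", standing in for real requests.
  final List<String> sentAcks = new ArrayList<>();

  void subscribe(String typeUrl, String name) {
    knownTypeUrls.add(typeUrl);
    subscribers.computeIfAbsent(typeUrl, k -> new HashSet<>()).add(name);
  }

  void unsubscribe(String typeUrl, String name) {
    Set<String> names = subscribers.get(typeUrl);
    if (names == null) {
      return;
    }
    names.remove(name);
    if (names.isEmpty()) {
      subscribers.remove(typeUrl); // knownTypeUrls is deliberately left untouched
    }
  }

  void onResponse(String typeUrl, String nonce) {
    if (!knownTypeUrls.contains(typeUrl)) {
      return; // a type this client never asked for
    }
    // Record the nonce and acknowledge even if nothing is subscribed right now.
    lastNonce.put(typeUrl, nonce);
    sentAcks.add(typeUrl + "@" + nonce);
  }

  public static void main(String[] args) {
    AlwaysAckSketch client = new AlwaysAckSketch();
    client.subscribe("eds", "cluster-a");
    client.onResponse("eds", "nonce-1");
    client.unsubscribe("eds", "cluster-a");
    client.onResponse("eds", "nonce-2"); // still acknowledged
    System.out.println(client.lastNonce.get("eds") + " " + client.sentAcks.size());
    // prints "nonce-2 2"
  }
}
```

With this bookkeeping, a later resubscription carries the nonce of the most recent response, at the cost of keeping a small per-type entry alive for the lifetime of the stream.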

I would like to raise a PR to resolve this issue.

What did you see instead?

The following is the Istiod log:

// Users apply VirtualService and DestinationRule update via kubectl
2024-05-24T07:06:53.925724Z	debug	Handle event update for configuration networking.istio.io/v1alpha3/VirtualService/default/e2e-service-provider
2024-05-24T07:06:53.934328Z	debug	Handle event update for configuration networking.istio.io/v1alpha3/DestinationRule/default/e2e-service-provider
2024-05-24T07:06:54.024710Z	debug	Handle event update for configuration networking.istio.io/v1alpha3/DestinationRule/default/e2e-service-provider
2024-05-24T07:06:54.126452Z	info	ads	Push debounce stable[17] 7 for config DestinationRule/default/e2e-service-provider and 3 more configs: 101.655214ms since last change, 235.324446ms since last push, full=true
2024-05-24T07:06:54.126868Z	debug	gateway	reconcile complete in 8.042µs
2024-05-24T07:06:54.127123Z	debug	ads	InitContext 2024-05-24T07:06:54Z/7 for push took 588.958µs
2024-05-24T07:06:54.127140Z	info	ads	XDS: Pushing:2024-05-24T07:06:54Z/7 Services:6 ConnectedEndpoints:1 Version:2024-05-24T07:06:54Z/7
// Istiod pushes CDS update in which all EDS will be unsubscribed
2024-05-24T07:06:54.127407Z	info	ads	CDS: PUSH for node:e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default resources:0 size:0B nonce:95973a27-8c59-4e82-897c-3929efb80890 version:2024-05-24T07:06:54Z/7
// Istiod pushes EDS update but neither ACK nor NACK is received
2024-05-24T07:06:54.127548Z	info	ads	EDS: PUSH for node:e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default resources:2 size:395B empty:0 cached:0/2 nonce:308676e2-26dc-412d-bcb1-8958087d4138 version:2024-05-24T07:06:54Z/7
2024-05-24T07:06:54.127602Z	debug	grpcgen	building lds for e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default with filter:
map[e2e-service-provider.default.svc.cluster.local:{map[e2e-service-provider.default.svc.cluster.local:{}] map[80:{}]}]
2024-05-24T07:06:54.127667Z	info	ads	LDS: PUSH for node:e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default resources:1 size:450B nonce:33198869-77f9-4f1e-a7d5-b43f78e76242 version:2024-05-24T07:06:54Z/7
2024-05-24T07:06:54.128177Z	info	ads	RDS: PUSH for node:e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default resources:1 size:4.8kB nonce:93bec8ee-fe97-4ca0-a242-5dbdc182a484 version:2024-05-24T07:06:54Z/7
2024-05-24T07:06:54.131841Z	debug	ads	ADS:CDS: REQ e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 resources:2 nonce:95973a27-8c59-4e82-897c-3929efb80890 version:2024-05-24T07:06:54Z/7 
2024-05-24T07:06:54.131937Z	debug	ads	ADS:CDS: ACK e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 2024-05-24T07:06:54Z/7 95973a27-8c59-4e82-897c-3929efb80890
// Subscription is adjusted from the client-side
2024-05-24T07:06:54.136024Z	debug	ads	ADS:EDS: REQ e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 resources:1 nonce:94c7313f-6355-4f09-aae2-7eb2de165491 version:2024-05-24T07:06:33Z/6
// But nonce is expired
2024-05-24T07:06:54.136057Z	debug	ads	ADS:EDS: REQ e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 Expired nonce received 94c7313f-6355-4f09-aae2-7eb2de165491, sent 308676e2-26dc-412d-bcb1-8958087d4138
2024-05-24T07:06:54.141001Z	debug	ads	ADS:LDS: REQ e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 resources:1 nonce:33198869-77f9-4f1e-a7d5-b43f78e76242 version:2024-05-24T07:06:54Z/7 
2024-05-24T07:06:54.141036Z	debug	ads	ADS:LDS: ACK e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 2024-05-24T07:06:54Z/7 33198869-77f9-4f1e-a7d5-b43f78e76242
2024-05-24T07:06:54.143833Z	debug	ads	ADS:RDS: REQ e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 resources:1 nonce:93bec8ee-fe97-4ca0-a242-5dbdc182a484 version:2024-05-24T07:06:54Z/7 
2024-05-24T07:06:54.143903Z	debug	ads	ADS:RDS: ACK e2e-service-consumer-base-5dd4cb9fbf-xqc6v.default-1 2024-05-24T07:06:54Z/7 93bec8ee-fe97-4ca0-a242-5dbdc182a484
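The "Expired nonce received ..., sent ..." line suggests that the server compares the nonce in each request with the last nonce it sent for that type. The following sketch models that check under this assumption. `NonceCheckSketch`, `Verdict` and `check` are hypothetical names, and the real Istiod logic may consider more state than this.

```java
// Hypothetical model of a server-side nonce check; not Istio code.
class NonceCheckSketch {
  enum Verdict { INITIAL, ACCEPTED, EXPIRED }

  // Classifies a request by comparing its nonce with the last nonce the server sent.
  static Verdict check(String requestNonce, String lastSentNonce) {
    if (requestNonce.isEmpty()) {
      return Verdict.INITIAL; // first request on a stream carries no nonce
    }
    return requestNonce.equals(lastSentNonce) ? Verdict.ACCEPTED : Verdict.EXPIRED;
  }

  public static void main(String[] args) {
    // Nonces taken from the EDS lines in the log above.
    String received = "94c7313f-6355-4f09-aae2-7eb2de165491";
    String sent = "308676e2-26dc-412d-bcb1-8958087d4138";
    System.out.println(check(received, sent)); // prints "EXPIRED"
  }
}
```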

Steps to reproduce the bug

First, apply a VirtualService that routes to DestinationRule subsets:

apiVersion: v1
kind: Service
metadata:
  name: e2e-service-provider
spec:
  selector:
    app: e2e-service-provider
    group: ft
  ports:
  - name: http-80
    port: 80
    targetPort: 8080
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: e2e-service-provider
spec:
  gateways:
    - istio-system/ingressgateway
    - mesh
  hosts:
    - e2e-service-provider
  http:
    - match:
        - headers:
            application:
                exact: e2e-service-consumer
            x-env-flag:
                exact: blue
      route:
        - destination:
            host: e2e-service-provider
            subset: e2e-service-provider-red
          weight: 100
    - match:
        - headers:
            x-env-flag:
                exact: yellow
          uri:
            exact: /rpc/fetchTag
      route:
        - destination:
            host: e2e-service-provider
            subset: e2e-service-provider-red
          weight: 100
      timeout: 60s
    - headers:
        request:
          set:
            x-env-flag: red
      match:
        - headers:
            x-env-flag:
              exact: red
        - queryParams:
            x-env-flag:
              exact: red
        - sourceLabels:
            version: red
      route:
        - destination:
            host: e2e-service-provider
            subset: e2e-service-provider-red
      timeout: 60s
    - route:
        - destination:
            host: e2e-service-provider
            subset: e2e-service-provider-base
      timeout: 60s
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: e2e-service-provider
spec:
  host: e2e-service-provider
  subsets:
    - labels:
        version: base
      name: e2e-service-provider-base
      trafficPolicy:
        loadBalancer:
          warmupDurationSecs: 60s
    - labels:
        version: red
      name: e2e-service-provider-red

Then apply new YAML that removes all subset usage:

# Step1: add services 
apiVersion: v1
kind: Service
metadata:
  name: e2e-service-provider
spec:
  selector:
    app: e2e-service-provider
    group: ft
    version: base
  ports:
  - name: http-80
    port: 80
    targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: e2e-service-provider-canary
spec:
  selector:
    app: e2e-service-provider
    group: ft
    version: red
  ports:
  - name: http-80
    port: 80
    targetPort: 8080
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: e2e-service-provider
spec:
  gateways:
    - istio-system/ingressgateway
    - mesh
  hosts:
    - e2e-service-provider
  http:
    - match:
        - headers:
            application:
                exact: e2e-service-consumer
            x-env-flag:
                exact: blue
      route:
        - destination:
            host: e2e-service-provider-canary
          weight: 100
    - match:
        - headers:
            x-env-flag:
                exact: yellow
          uri:
            exact: /rpc/fetchTag
      route:
        - destination:
            host: e2e-service-provider-canary
          weight: 100
      timeout: 60s
    - headers:
        request:
          set:
            x-env-flag: red
      match:
        - headers:
            x-env-flag:
              exact: red
        - queryParams:
            x-env-flag:
              exact: red
        - sourceLabels:
            version: red
      route:
        - destination:
            host: e2e-service-provider-canary
      timeout: 60s
    - route:
        - destination:
            host: e2e-service-provider
      timeout: 60s
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: e2e-service-provider
spec:
  host: e2e-service-provider
  trafficPolicy:
    loadBalancer:
      warmupDurationSecs: 60s
  subsets: []
