e2es fail when looking for a deleted resource #5632

@nrb

/kind bug

What steps did you take and what happened:

CI jobs are failing intermittently. The failure usually, but not always, occurs in the test capa-e2e.[It] [unmanaged] [functional] Workload cluster with AWS S3 and Ignition parameter It should be creatable and deletable.

The failures look like this in context:

   STEP: Waiting for the workload nodes to exist @ 08/27/25 08:29:33.357
  INFO: clusterctl describe cluster functional-test-ignition-3x6cjb --show-conditions=all --show-machinesets=true --grouping=false --echo=true --v1beta2

# looped clusterctl describe commands omitted

  STEP: Finding EC2 instance with ID: aws:///us-west-2a/i-0f939de53fc0aee52 @ 08/27/25 08:30:14.046
  STEP: Validating the s3 endpoint was created @ 08/27/25 08:30:14.548
  STEP: Deleting the cluster @ 08/27/25 08:30:15.135
  STEP: Deleting cluster functional-test-ignition-718zhm/functional-test-ignition-3x6cjb @ 08/27/25 08:30:15.136
  STEP: Waiting for cluster functional-test-ignition-718zhm/functional-test-ignition-3x6cjb to be deleted @ 08/27/25 08:30:15.156
  INFO: clusterctl describe cluster functional-test-ignition-3x6cjb --show-conditions=all --show-machinesets=true --grouping=false --echo=true --v1beta2
  INFO: clusterctl describe cluster functional-test-ignition-3x6cjb --show-conditions=all --show-machinesets=true --grouping=false --echo=true --v1beta2

  INFO: clusterctl describe cluster functional-test-ignition-3x6cjb --show-conditions=all --show-machinesets=true --grouping=false --echo=true --v1beta2


# looped clusterctl describe commands omitted

  STEP: Waiting for AWSCluster to show the VPC endpoint as deleted in conditions @ 08/27/25 08:35:25.68
  [FAILED] in [It] - /home/prow/go/pkg/mod/sigs.k8s.io/cluster-api/[email protected]/framework/cluster_helpers.go:373 @ 08/27/25 08:35:25.69
  << Timeline
  [FAILED] Failed to run clusterctl describe
  Unexpected error:
      <*errors.StatusError | 0xc000791860>: 
      clusters.cluster.x-k8s.io "functional-test-ignition-3x6cjb" not found
      {
          ErrStatus: {
              TypeMeta: {Kind: "", APIVersion: ""},
              ListMeta: {
                  SelfLink: "",
                  ResourceVersion: "",
                  Continue: "",
                  RemainingItemCount: nil,
              },
              Status: "Failure",
              Message: "clusters.cluster.x-k8s.io \"functional-test-ignition-3x6cjb\" not found",
              Reason: "NotFound",
              Details: {
                  Name: "functional-test-ignition-3x6cjb",
                  Group: "cluster.x-k8s.io",
                  Kind: "clusters",
                  UID: "",
                  Causes: nil,
                  RetryAfterSeconds: 0,
              },
              Code: 404,
          },
      }
  occurred
  In [It] at: /home/prow/go/pkg/mod/sigs.k8s.io/cluster-api/[email protected]/framework/cluster_helpers.go:373 @ 08/27/25 08:35:25.69

What did you expect to happen:

Tests should not fail on a 404 while waiting for a resource to be deleted.

Anything else you would like to add:

This looks like a race condition: the controllers finish cleaning up the Cluster resource while the tests are still looking up that resource as its dependencies are being deleted.

Environment:

  • Cluster-api-provider-aws version: f563206 and later

Metadata

Labels

  • area/deflake: Issues or PRs related to deflaking Cluster API tests
  • kind/bug: Categorizes issue or PR as related to a bug.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.
