-
Notifications
You must be signed in to change notification settings - Fork 17
Bugs: wrong address and memory leak #285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* We create several etcd clients in the reconciliation loop, but never had a defer .Close() for them. * We erroneously try to configure these clients with etcd-0 instead of etcd-0.etcd-headless.namespace as the url. Now fixed. Signed-off-by: Timofei Larkin <[email protected]>
WalkthroughThe changes introduce improved resource management and endpoint formatting in the etcd client handling logic. In the etcd cluster controller, a deferred cleanup ensures that all etcd client connections are closed at the end of the reconciliation process. Additionally, the etcd client factory now closes the cluster client if single-endpoint client creation fails, and it updates endpoint URLs to use a fully qualified DNS format, enhancing clarity and correctness in endpoint addressing. TLS client authentication is supported by loading client certificates and server CA certificates from Kubernetes secrets when security is enabled. Changes
Poem
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (1.64.8)level=warning msg="The linter 'exportloopref' is deprecated (since v1.60.2) due to: Since Go1.22 (loopvar) this linter is no longer relevant. Replaced by copyloopvar." Tip ⚡💬 Agentic Chat (Pro Plan, General Availability)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
internal/controller/etcdcluster_controller.go (1)
114-123
: Good cleanup implementation to prevent resource leaksThis deferred function properly cleans up all etcd client resources at the end of the reconciliation process. This ensures that both the cluster client and all single-endpoint clients are closed regardless of how the function returns, preventing memory leaks.
However, the return values from the
Close()
calls are not being checked.Consider capturing and logging errors from the Close() calls, especially if they could indicate potential issues:
defer func() { if clusterClient != nil { - clusterClient.Close() + if err := clusterClient.Close(); err != nil { + log.Debug(ctx, "error closing cluster client", "error", err) + } } for i := range singleClients { if singleClients[i] != nil { - singleClients[i].Close() + if err := singleClients[i].Close(); err != nil { + log.Debug(ctx, "error closing single client", "index", i, "error", err) + } } } }()🧰 Tools
🪛 golangci-lint (1.64.8)
116-116: Error return value of
clusterClient.Close
is not checked(errcheck)
120-120: Error return value of
(*go.etcd.io/etcd/client/v3.Client).Close
is not checked(errcheck)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
internal/controller/etcdcluster_controller.go
(1 hunks)internal/controller/factory/etcd_client.go
(2 hunks)
🧰 Additional context used
🪛 golangci-lint (1.64.8)
internal/controller/etcdcluster_controller.go
116-116: Error return value of clusterClient.Close
is not checked
(errcheck)
120-120: Error return value of (*go.etcd.io/etcd/client/v3.Client).Close
is not checked
(errcheck)
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: test on k8s latest version
- GitHub Check: test on k8s previous version
- GitHub Check: test-e2e on k8s latest version
- GitHub Check: test on k8s penultimate version
- GitHub Check: test-e2e on k8s penultimate version
- GitHub Check: test-e2e on k8s previous version
- GitHub Check: pre-commit
🔇 Additional comments (2)
internal/controller/factory/etcd_client.go (2)
32-32
: Good fix for preventing resource leaksAdding
clusterClient.Close()
ensures that if any single-endpoint client creation fails, the cluster client is properly closed. This prevents resource leaks, addressing one of the memory leak issues mentioned in the PR objectives.
60-60
: Improved endpoint URL formatting with proper FQDNThis update properly constructs the endpoint URLs as fully qualified domain names using Kubernetes DNS naming conventions. This fixes the incorrect URL issue mentioned in the PR objectives, changing from simple hostnames to a format that includes the service name and namespace for proper resolution.
Signed-off-by: Timofei Larkin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
internal/controller/factory/etcd_client.go (1)
119-132
: CA certificate parsing function could improve error handlingThe
parseTLSSecretCA
function is generally well-implemented, but the error handling for PEM decoding could be improved.Consider enhancing the PEM decoding error handling:
- block, _ := pem.Decode(caPemBytes) + block, rest := pem.Decode(caPemBytes) if block == nil { err = fmt.Errorf("failed to decode PEM bytes") return } + if len(rest) > 0 { + // Optional: Log or handle remaining data if there are multiple certificates + }This change captures the remaining data and allows you to handle cases where the PEM might contain multiple certificates.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/controller/factory/etcd_client.go
(3 hunks)
🧰 Additional context used
🪛 golangci-lint (1.64.8)
internal/controller/factory/etcd_client.go
36-36: Error return value of clusterClient.Close
is not checked
(errcheck)
🪛 ast-grep (0.31.1)
internal/controller/factory/etcd_client.go
[warning] 80-80: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. Add
MinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(missing-ssl-minversion-go)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: test-e2e on k8s penultimate version
- GitHub Check: test-e2e on k8s latest version
- GitHub Check: test-e2e on k8s previous version
- GitHub Check: pre-commit
🔇 Additional comments (3)
internal/controller/factory/etcd_client.go (3)
64-64
: Good improvement in endpoint URL formattingThe URL format has been correctly updated to use the fully qualified domain name, which addresses the issue with incorrect client URLs mentioned in the PR.
66-66
: Good addition of dial timeoutSetting a default dial timeout of 5 seconds is a good practice to prevent clients from hanging indefinitely when connecting to unavailable endpoints.
103-117
: Well-structured TLS certificate parsing functionThe
parseTLSSecret
function is well implemented with proper error checking and clear error messages.
Signed-off-by: Timofei Larkin <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
internal/controller/factory/etcd_client.go (1)
69-99
: 🛠️ Refactor suggestionMissing TLS MinVersion in the configuration
The TLS configuration is missing a MinVersion setting, which could expose the connection to older, less secure TLS protocol versions.
Add the MinVersion setting to the TLS configuration:
etcdClient.TLS = &tls.Config{} + etcdClient.TLS.MinVersion = tls.VersionTLS13 etcdClient.TLS.Certificates = []tls.Certificate{cert}
This ensures that only TLS 1.3 or newer will be used, which offers better security.
🧰 Tools
🪛 ast-grep (0.31.1)
[warning] 81-81: MinVersion
is missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. Add
MinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures(missing-ssl-minversion-go)
🧹 Nitpick comments (1)
internal/controller/factory/etcd_client.go (1)
120-133
: Improve error handling in parseTLSSecretCAThe function structure is good, but there's a missing error check after pem.Decode.
block, _ := pem.Decode(caPemBytes) if block == nil { err = fmt.Errorf("failed to decode PEM bytes") return } + if block.Type != "CERTIFICATE" { + err = fmt.Errorf("PEM block is not a certificate (type: %s)", block.Type) + return + } ca, err = x509.ParseCertificate(block.Bytes)This additional check ensures the PEM block is actually a certificate before attempting to parse it.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
internal/controller/etcdcluster_controller.go
(3 hunks)internal/controller/factory/etcd_client.go
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/controller/etcdcluster_controller.go
🧰 Additional context used
🪛 ast-grep (0.31.1)
internal/controller/factory/etcd_client.go
[warning] 81-81: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. Add
MinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures
(missing-ssl-minversion-go)
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: test-e2e on k8s previous version
- GitHub Check: test-e2e on k8s latest version
- GitHub Check: test-e2e on k8s penultimate version
- GitHub Check: test on k8s latest version
- GitHub Check: test on k8s penultimate version
- GitHub Check: test on k8s previous version
- GitHub Check: pre-commit
🔇 Additional comments (4)
internal/controller/factory/etcd_client.go (4)
36-37
: Correctly handled client close for resource cleanupGood to see the comment added to explain why the error is ignored. The
nolint:errcheck
directive is an acceptable approach here since the primary concern is resource cleanup rather than handling the error.This change helps fix the memory leak issue mentioned in the PR objectives by ensuring the cluster client is closed if any single-endpoint client creation fails.
65-65
: Correct URL format with fully qualified domain nameThe endpoint URL now properly uses the fully qualified domain name format (
name.headless-service.namespace.svc:port
) instead of just the hostname. This addresses the connectivity issue mentioned in the PR objectives.
67-67
: Added reasonable dial timeoutAdding a 5-second dial timeout is a good practice to prevent hanging connections when etcd nodes are unavailable.
104-118
: Well-structured TLS secret parsing functionThis helper function correctly extracts the TLS certificate and key from the Kubernetes secret. The error handling is comprehensive, with descriptive error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Now fixed.
Summary by CodeRabbit
Bug Fixes
New Features
Refactor