-
Notifications
You must be signed in to change notification settings - Fork 1.6k
KEP-5497: Add new imagePullPolicy mode named IfNotNewerPresent #5499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
KEP-5497: Add new imagePullPolicy mode named IfNotNewerPresent #5499
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: nikos445 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Welcome @nikos445! |
Hi @nikos445. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
// Get remote image id (digest) | ||
remoteImageRef, _ := GetRemoteImageDigestWithoutPull(ctx, requestedImage, pullSecrets) | ||
|
||
// Compare digests if not match then pull image from remote | ||
if remoteImageRef != imageRef { | ||
msg = fmt.Sprintf("Remote image for %q changed (local=%s, remote=%s), will pull new one.", requestedImage, imageRef, remoteImageRef) | ||
m.logIt(objRef, v1.EventTypeWarning, events.FailedToInspectImage, logPrefix, msg, klog.Warning) | ||
return "", msg, err | ||
} | ||
|
||
msg = fmt.Sprintf("Remote image for %q is exactly the same (local=%s, remote=%s), using existing one.", requestedImage, imageRef, remoteImageRef) | ||
m.logIt(objRef, v1.EventTypeWarning, events.FailedToInspectImage, logPrefix, msg, klog.Warning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about how this simple digest equality check can guarantee the image on registry is newer one.
If I understood correctly, there is basically no concept of newer/older over container images, and container images are just identified with a digest hash at the bottom.
Thus, I think there could be complicated scenarios (e.g., rollback operation, registry was restored from backup after some severe failures, etc.) that might not fit with the meaning of Newer.
So, we could consider better name for this policy behaviour? For example, IfNoUpdatedPresent
? I know this is not a good one though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for suggestion, how about: IfDiffNotPresent
? Since it checks for difference in reality. Reverting from backup hasn't crossed my mind that might be older so 'newer' is not right keyword, you are right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another problem is that if this policy can accept older images ( I imagined the case which the registory was restored from backup), in such a case, the policy could not achieve backported/pached image user-story (Story 2).
In this case, even backported/pached image was already mirrored on the node, older image(not backporeted/patched) can be pulled from the registry...
Changing the policy name could fit with this behavior. But, is this behavior really desired and wanted?
Thus, now, I'm thinking that it's not a good idea to support such anti-pattern practice, as discussed in kubernetes/kubernetes#133680.
As discussed in kubernetes/kubernetes#133680 and you are already aware, Always
policy can cover most user stories described in the KEP. The exception case is Always
policy with "cached locally but non-exist on registry" case as your described in slack( https://kubernetes.slack.com/archives/C0BP8PW9G/p1756226046849689?thread_ts=1756224253.541639&cid=C0BP8PW9G).
So, how about elaborating this KEP to focus on the case??
how about: IfDiffNotPresent ?
how about IfDigestChanged
?? We don't need to use the word of Present
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would like mostly to focus on that use case where pods may have locally an image and not starting because Always
failed to pull.
This is exactly what I try to implement here and if solved in policy Always
, then no need to have a separate IfNewerNotPresent
policy. Also we could implement something to check digest with Always
before pulling, to make it even faster than trying to verify all layers (but up to KEP approval)
I just started this discussion with another imagePullPolicy since I wanted to not modify core Kubernetes functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also we could implement something to check digest with Always before pulling to make it even faster than trying to verify all layers (but up to KEP approval)
Sounds nice! I'd say this could separate into two independent KEPs:
- Make
Always
faster by just checking its digest- This would also need KEP, but this would be ok to update only CRI
- no user facing changes (just speedup), right?
- Make
Always
resilient for the case which "cached locally, but not-exist on remote"- This would introduce a new image pull policy because it's a behavioral change
I just started the discussion with another imagePullPolicy since I wanted to not modify core Kubernetes functionality.
could you kindly give me a pointer to this discussion? I'd like to watch :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry its a typo "this discussion" instead of "the discussion" :) I meant this KEP-5497
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry forgot to mention, I like your suggestion "IfDigestChanged"
|
||
This KEP proposes the introduction of a new `imagePullPolicy` mode named `IfNewerNotPresent`. The desired outcome is to provide a more efficient and | ||
predictable image-pulling strategy that ensures workloads use the latest available image when updates exist, while avoiding unnecessary downloads and | ||
startup delays. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the image repository is unreachable?
- With
Always
, the Pod fails to start. - With
IfNotPresent
, the Pod can still start.
Generally, is forcing a pull necessary if the image ID has not changed?
- Probably not.
- Would this be considered a bug in
Always
? (Most likely not.) - To me,
IfNewerNotPresent
feels closer to anAlwaysCheck
mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, IfNewerNotPresent
checks image id if not same as local then it pulls (like Always) if same then it doesnt change anything (like IfNotPresent) and if registry fails or some connection is not possible then it falls back to local image after some retries (like IfNotPresent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly,
IfNewerNotPresent
checks image id if not same as local then it pulls (like Always) if same then it doesnt change anything (like IfNotPresent) and if registry fails or some connection is not possible then it falls back to local image after some retries (like IfNotPresent).
couple comments...
- the terminology is even more confusing than always/ifnotpresent/never .. as you are injecting an extra always pull behavior but reducing it's rule to force a check only if the connection is "currently" available.. if not run pod/container anyway..
- Always is also a security check. All you would need to do is pull the ethernet cable to avoid the security check.
- prefer to not overload the always pull image policy with a simple rule allows fall back to if not present on bad connection/access denied issues.
I recommend considering a different route. Something that adds a new rule type with directives that decide what to do on connection failure, access denied, image tag does not exist, ... type issues between the local container image service and the image registry. But IMO that can probably only apply to the if not present rule... otherwise always no longer means always. Some clouds run with the injector that changes the image pull policy to always for multi-tenant use.
IMO the if not present determination could be subject to further clarification so long as the default determination is backwards compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Mike,
thanks for the feedback. 😀
So as I understand, you suggest to "invent" a new variable in spec like imagePullPolicy
for covering the use cases of fallbacks, e.g imagePullFallbackPolicy
.
With modes like:
- if it fails to pull image from remote, then to use local.
- if it fails, to stay on imagePullBackoff.
- etc...
Thats a great idea, but this doesn't cover the cases of mutable images.
My idea, about IfNewerNotPresent
, was about pulling if images, if Digest doesn't match. Like when someone pushed a "latest" or "staging" tag.
That was the initial suggestion for this rule (IfNewerNotPresent
), the secondary idea is the fallback, to Local image, if remote image is not available or if it already exists (like IfNotPresent does).
I suggested a different rule, so we can maintain backwards compatibility and the rules: "Always" and "IfNotPresent" intact. For people that need it.
My use cases described in KEP are not covered fully by Always (due to not checking checksum and fallback) and not at all by the IfNotPresent (doesnt check anything - always use local)
I think your suggestion might be interesting case for another KEP even, but not relevant to this imagePullPolicy mode enhancement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About the naming, I am happy to hear any suggestions people have to improve it 😁 I understand its not very clear but I believe good Documentation will solve this matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where I said "image tag does not exist, ..." tag deleted or no longer present on the registry is also a type of image tag mutation. By ..
I mean that reason and any other reason / rule for clarifying what present and/or not present means with respect to image tagging and other image meta data, such as present and recent(when was the last time a tag was checked within 1hr, 1day, 1week,...), present and authorized (already covered by ensure secret pulled images), ... Not an exhaustive list. If a very fat image layer is already being downloaded do you want to run the pod on the older image or wait. Would there be a policy for upgrading the local image and using that updated image when it has been retrieved. I believe this was one of the patterns being discussed in the linked issue.
As to not checking checksum.. do you mean you think something is amiss with the runtime's process for verifying SHA digests of the container image manifests, config, layers, ...?
my reason for suggesting a different path is that the image pull policy already has a few hidden dimensions adding another might make it even more complicated and risky...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming is .. hard
maybe ImageResolvePolicy
Uh oh!
There was an error while loading. Please reload this page.