- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.3k
Fix for retry cancellation #2456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Codecov ReportAll modified and coverable lines are covered by tests ✅ 
 Additional details and impacted files@@           Coverage Diff           @@
##             main    #2456   +/-   ##
=======================================
  Coverage   85.35%   85.36%           
=======================================
  Files         312      312           
  Lines        7465     7466    +1     
  Branches     1121     1121           
=======================================
+ Hits         6372     6373    +1     
  Misses        908      908           
  Partials      185      185           
 Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. | 
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
| try | ||
| { | ||
| try | ||
| context.CancellationToken.ThrowIfCancellationRequested(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling the ThrowIfCancellationRequested here might be too late.  The OnRetry telemetry event is reported and the OnRetry callback is already executed. But there will be no new retry attempt if cancellation is requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered this. The event may be named "on retry," but the actual meaning is more like "on handled by the retry policy." Currently, any outcome that is not returned to the caller is considered "handled" and triggers this event. Users don't want to wait for the next attempt (which may or may not happen). We want to know the strategy was triggered since this tells us the callback completed and why the outcome was not returned.
Cancellation in .NET is cooperative. It's normal for code to finish doing certain important tasks until it comes to a better "stopping place" to acknowledge the cancellation. Logging/telemetry for what has just happened I think falls into this category. I would be open to skipping the "delay" calculation and logging a zero if cancellation is triggered, but I'm also not convinced this adds much value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now the strategy works like this on high level for a retry attempt (happy path):
- user provided callback is executed
- telemetry event is reported
- delay is calculated
- onretry is executed
- delay is waited
The execution can be stopped due to the following circumstances: the outcome is not handled by the strategy, the attempts are exhausted. From one side treating the cancellation in a different way feels a bit odd. But I agree that if the user provided callback executed then the telemetry and OnRetry hook should be performed as well because they allow the consumers to get insights what happened.
The OnRetryArguments serves multiple purposes. It tells about the past (outcome, duration, etc.) but also shares some information about the future (delay). You can access the information whether the cancellation was requested via the context (context.CancellationToken.IsCancellationRequested) but since it is not a top-level field I have doubts that anyone has ever checked it. IMHO making this information as a top-level field would make the 0/-1 delay more meaningful by providing contextual information.
IMHO the best would be to have something like this:
flowchart TD
    Args[OnRetryAgruments]
    Current[CurrentAttempt]
    Next[NextAttempt]
    Args --> Current
    Args --> Next
    Current --> Duration
    Current --> Outcome
    Current --> A[etc.]
    Next --> IsCancelled
    Next --> Delay
    Next --> b[etc.]
    Maybe in V9 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When considering this particular fix, the things I am looking for are:
- The old outcome is correctly disposed if the new one is produced.
- The retry outcome (either success, error, cancellation) is properly reported and no telemetry associated with produced outcome is lost.
If the cancellation happens even after we said we will do next attempt, that's ok with me. That's the nature of cancellation actually :)
| ResilienceContextPool.Shared.Get(cancellation.Token), | ||
| default(object)); | ||
|  | ||
| result.Exception.ShouldBeNull(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we assert that the result from the delegate is indeed returned?
| Thanks for your contribution @kmcclellan - the changes from this pull request have been published as part of version 8.5.2 📦, which is now available from NuGet.org 🚀 | 
Pull Request
The issue or feature being addressed
[Bug]: Inconsistent behavior when canceling a retry
Details on the issue fix or feature implementation
Despite how the conversations around the previous attempt, only a small change is needed to make this work in a way that is coherent.
Note my choice to avoid throwing until after delay is calculated and the retry event is logged. I think this is most likely what a user will expect, since every failed attempt currently generates such a log, and the outcome was "handled" in the sense that it was prevented from surfacing.
This also takes care of @martintmk's concern about disposables, since
DisposeHelperis also called before cancellation is acknowledged, just as though we were continuing with the retry. I've added an extra test case for this too.Confirm the following