
Conversation

@alexbrt (Contributor) commented Jul 25, 2025

Azure Application Insights frequently times out, returns HTTP 503 errors, or only accepts partial payloads. This PR adds support for configuring a backon-based backoff strategy to automatically retry telemetry uploads in such cases.
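For context, the retry schedule this PR configures behaves roughly like the following sketch. This is a hypothetical, simplified model (the actual implementation delegates to `backon`): delays start at 500 ms, double on each attempt, and are capped at 5 s. The real strategy also adds jitter, which is omitted here so the schedule is deterministic.

```rust
/// Simplified model of the exponential backoff schedule (jitter omitted):
/// 500 ms, 1 s, 2 s, 4 s, then capped at 5 s for every later attempt.
fn backoff_delays_ms(attempts: u32) -> Vec<u64> {
    let min_ms = 500u64; // delay before the first retry
    let max_ms = 5_000u64; // cap on any single delay
    (0..attempts)
        // double the delay each attempt (shift left), then cap at max_ms
        .map(|i| (min_ms << i.min(16)).min(max_ms))
        .collect()
}
```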

@frigus02 (Owner) left a comment

Thanks a lot for looking into this. This is great.

@frigus02 frigus02 linked an issue Jul 25, 2025 that may be closed by this pull request
@alexbrt alexbrt force-pushed the main branch 2 times, most recently from 995ab29 to 6d833f0 Compare July 30, 2025 10:46
@frigus02 (Owner) commented Aug 4, 2025

Sorry for the slow replies. I was pretty swamped the last few days. I'll try to get to this later today or by Wednesday at the latest.

@frigus02 (Owner) left a comment

I'm happy with this overall. I think a test would be great if possible. Though I'm not sure if the test framework allows that currently. I'm happy to take a stab at that over the weekend.

@frigus02 (Owner) commented Aug 9, 2025

Added tests. I think this is good to merge now. I'll try to release a new version over the weekend.

@frigus02 frigus02 merged commit 68348f9 into frigus02:main Aug 9, 2025
2 checks passed
frigus02 added a commit that referenced this pull request Aug 9, 2025
frigus02 added a commit that referenced this pull request Aug 9, 2025
Azure Application Insights frequently times out, returns HTTP 503
errors, or only accepts partial payloads. This adds transparent retries.
Users of this crate won't have to configure anything.

Using the `backon` crate, because it seems small and easy to use.

Design decisions:

- Don't expose implementation. Add default retries that should work for
  everyone:

  - min_delay = 500ms
  - max_delay = 5s
  - with jitter (to avoid thundering herds when multiple exports run in parallel)

  We wanted no max_times or total delay, because the BatchSpanProcessor
  already provides a `max_export_timeout` option, which automatically
  ensures that the exporter stops. However, that option is only
  available in the experimental async version of the BatchSpanProcessor,
  and it's missing from the SimpleSpanProcessor and from the metric and
  log processors. Therefore this sets a total delay of 35s.

- Add a `with_retry_notify` option. It can be used to log retries and
  thereby debug exporter flakiness.

- Use the `futures-timer` crate to provide sleep functionality during
  retries. Ideally we could use
  opentelemetry_sdk::runtime::Runtime::delay for that, but it's behind
  an experimental feature, and the SDK doesn't provide the exporters
  with a runtime. We'd have to add a configuration option to the
  exporter where users specify their chosen runtime, which would then
  have to match the rest of their setup; otherwise they'd see runtime
  errors.

  E.g. Tokio's sleep, if not executed in the context of a Tokio
  runtime, fails with:

      there is no reactor running, must be called from the context of a Tokio 1.x runtime

  This error would also only surface when a retry actually happens,
  i.e. when a call to AppInsights fails, which would make it very hard
  to debug.

  We're using futures-timer instead, which works with all runtimes,
  including tokio and futures-executor. The OpenTelemetry SDK uses the
  latter for most default processors right now.
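Taken together, these decisions (exponential delays capped at 5 s, a 35 s total-delay budget, and a notify hook) can be sketched as a plain synchronous retry loop. This is a hypothetical illustration, not the crate's actual code, which uses `backon` with async sleeps via `futures-timer`; the `sleep` parameter is injected here so the sketch stays testable without real waiting.

```rust
use std::time::Duration;

/// Hypothetical sketch of the retry behaviour described above:
/// delays start at 500 ms, double up to a 5 s cap, and retries stop
/// once the total sleep time would exceed the 35 s budget. The
/// `notify` hook mirrors the role of `with_retry_notify`.
fn retry_with_notify<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    mut notify: impl FnMut(&E, Duration),
    mut sleep: impl FnMut(Duration), // injected so tests can skip real sleeping
) -> Result<T, E> {
    let mut delay = Duration::from_millis(500);
    let max_delay = Duration::from_secs(5);
    let budget = Duration::from_secs(35);
    let mut slept = Duration::ZERO;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if slept + delay <= budget => {
                notify(&e, delay); // e.g. log the failure and the upcoming delay
                sleep(delay);
                slept += delay;
                delay = (delay * 2).min(max_delay); // exponential, capped
            }
            Err(e) => return Err(e), // budget exhausted: give up
        }
    }
}
```

With the budget of 35 s and this schedule (0.5 + 1 + 2 + 4 + 5 + 5 + …), the loop allows roughly nine retries before giving up, which matches the intent of bounding the exporter's total runtime even without `max_export_timeout`.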

---------

Co-authored-by: Jan Kuehle <[email protected]>
@frigus02 (Owner) commented Aug 9, 2025

Published with version 0.42.0.

I hope I didn't remove anything you needed in the commits today. If I did, please let me know. And either way, thanks a lot for getting this started! This has been on my todo list for far too long.



Successfully merging this pull request may close these issues.

Retry behaviour
