Skip to content

Conversation

@vector-of-bool
Copy link
Contributor

@vector-of-bool vector-of-bool commented Jul 24, 2025

Refer: CDRIVER-6048

Background

This is a preliminary work for CSOT, to give us a stable foundation for working with time going forward.

The C driver has had some duration/time types (mcd_time_point and mcd_duration), but they weren't widely used in the codebase and could use some TLC. This PR description will only present a high-level overview, as much of the implementation explanation has been added as commentary within the source files themselves.

This PR also updates one core internal submodule to use the new time APIs, as a test-run of deploying them into the codebase in preparation for CSOT.

Fortunately, the vast majority of code additions in this changeset are explanatory comments.

New Time APIs

The new time types are similar to the previous mcd_ time types:

  • struct mlib_duration represents a (possibly negative) difference between two points in time, with a fixed resolution (currently it has microsecond granularity). This is meant to replace any plain integers in the codebase that represent durations as a count of units. Using plain integers is problematic since the associated time unit cannot be statically enforced, and doing arithmetic on plain integers is inherently unsafe. The mlib_duration arithmetic is fully well-defined to use saturating arithmetic, so attempting to make a "too large" duration value will simply clamp to the nearest representable duration. While this is potentially problematic in extreme conditions, it only appears when juggling durations of hundreds of millennia, which reconnects Russia+Alaska and deletes Japan.
  • struct mlib_time_point represents a fixed point-in-time. Currently the implementation is written against the program's monotonic clock. It is encoded as an mlib_duration relative to the unspecified monotonic clock epoch. Because the epoch is unspecified, this type cannot be reliably converted to any "IRL" time point. (I tried writing an lldb pretty-printer to do this for debugging purposes, but it turns out to be very non-trivial and platform-dependent.)
    • The main API surface for interacting with the underlying clock is mlib_now() which simply returns a time-point for the moment that the function is called.
    • The mlib_now() function actually replaces the implementation of bson_get_monotonic_time function, which incidentally fixes CDRIVER-4526, since mlib_now uses the Win32 fine-grained monotonic clock.
    • Adding support for different clocks would be possible, but would require that the struct carry a clock identifier, and would make things significantly more complicated, as doing comparisons/differences between time points of different clocks requires knowing how the clock epochs are related, which, as mentioned, is non-trivial, enough that no one really wants to do this. For C++, for example, there is no stdlib conversation between the monotonic clock time points and the "IRL" clock time points.
  • mlib_timer represents a deadline. This simply stores an "expires-at" mlib_time_point. The main reason for this being a separate type is to provide distinct types and functions specific to manipulating and inspecting deadlines.

Creating and Manipulating Durations

The initial duration API was extremely verbose, and looked like:

int foo(int a, int b) {
  mlib_duration d = mlib_duration_add(mlib_milliseconds(a), mlib_seconds(b));
}

while this works, its incredibly tedious to write. Instead, an mlib_duration() function-like macro was added, which can be used used with two arguments to create a duration from a unit count, or three arguments to do duration arithmetic, and supports nested argument lists:

int foo(int a, int b) {
  mlib_duration d = mlib_duration((a, ms), plus, (b, sec));
}

See the doc comments in mlib/duration.h for an explanation. The macro trickery isn't pretty, but it is heavily commented for future maintainers.

Usage in mongoc-async-cmd

The mongoc-async-cmd component was chosen arbitrarily as a test bed for the new APIs. Unfortunately, the code in there was heavily under-commented and used non-descriptive struct member names, so a lot of effort was spent just deducing what it is actually trying to do with the dozen int64_t values it was using for timeouts/timestamps/deadlines.

A heavy amount of comments and API renames have been applied to the module, to ensure that we're doing what we expect, and so future refactors won't need to spend a whole 3+ days trying to deduce what everything means. This doesn't give me a great outlook on future CSOT refactor work, depending on whether all time-handling code is similarly obtuse or I just happened to pick a module that was particularly problematic.

Additionally, my refactor of mongoc-async-cmd code led to the discovery that command deadlines can be violated (up to 2x or possibly greater) because of a hack that was added for CDRIVER-1571. Since this is a core module underlying all command execution, refactors will be required to prevent this if we want a successful CSOT implementation. An inline function doc-comment explains the problem.

Review Order

Initially, there was attempt to do isolated commits in logical order, but it somewhat broke down towards the end. Instead, it is recommended to review the final changes in the following order:

  1. mlib/platform.h is added to just #include some common platform headers "correctly", since that's actually non-trivial.
  2. mlib/duration.h is the basis of all time functions.
  3. mlib/time_point.h builds upon duration to create points-in-time.
  4. mlib/timer.h a simpler file that adds deadline-specific functions.
  5. The mongoc-async and mongoc-async-cmd files, which refactor to use the time types.
  6. Everything else: There are several other minor changes across the codebase.

This is intended to replace `mcd_duration` as a more
capable duration type.
This uses preprocessor trickery to support more
terse conversion and arithmetic of duration types.
The following changes are made, in order of significance:

1. The `mongoc_stream_poll` function declaration now says the units on its
   timeout.
2. Add a `_mongoc_stream_poll_internal` that takes a deadline timer rather than
   a timeout duration.
3. Move async-command typedefs into the async-command header.
4. async command creation now takes typed durations instead of integers
5. Async command internally uses timers, time points, and durations,
   rather than juggling `int64_t` for everything.
7. Heavy comments in async-cmd files are added all over the place to explain
   what is going on.
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a possible overflow check.

Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some unaddressed feedback remaining (+ ClangFormat); otherwise, LGTM.

@vector-of-bool vector-of-bool changed the title [CDRIVER-6048] Add New Time and Duration Functionality [CDRIVER-6048][CDRIVER-4526] Add New Time and Duration Functionality Aug 15, 2025
@vector-of-bool vector-of-bool merged commit 4285d76 into mongodb:master Aug 15, 2025
38 of 42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants