Commit bcfa257

author
Michael Agun
committed
Implement libbpf ringbuf synchronous API.
1 parent fa21b3e commit bcfa257

9 files changed

+1371
-91
lines changed

docs/RingBuffer.md

Lines changed: 118 additions & 65 deletions
@@ -14,15 +14,15 @@ This proposal adds support for synchronous callbacks (like libbpf on linux) and
1414

1515
Asynchronous callback consumer:
1616

17-
1. Call `ring_buffer__new` to set up callback with RINGBUF_FLAG_AUTO_CALLBACK specified.
18-
- On Linux synchronous callbacks are always used, so the new AUTO_CALLBACK flags are Windows-specific.
19-
- Note: automatic callbacks are the current default behavior, but eventually
20-
this will change with [#4142](https://github.com/microsoft/ebpf-for-windows/issues/4142) to match the linux behavior so should always be specified.
17+
1. Call `ebpf_ring_buffer__new` to set up callback with `EBPF_RINGBUF_FLAG_AUTO_CALLBACK` specified.
18+
- On Linux synchronous callbacks are always used, so the `EBPF_RINGBUF_FLAG_AUTO_CALLBACK` flag is Windows-specific.
19+
- Note: automatic callbacks were the original default behavior, but the default has been changed to be source-compatible with Linux.
2120
2. The callback will be invoked for each record written to the ring buffer.
2221

2322
Synchronous callback consumer:
2423

25-
1. Call `ring_buffer__new` to set up callback with RINGBUF_FLAG_NO_AUTO_CALLBACK specified.
24+
1. Call `ring_buffer__new` to set up callback (uses synchronous mode by default to match Linux).
25+
- Or call `ebpf_ring_buffer__new` with `EBPF_RINGBUF_FLAG_NO_AUTO_CALLBACK` for explicit synchronous mode.
2626
2. Call `ring_buffer__poll()` to wait for data if needed and invoke the callback on all available records.
2727

2828
Mapped memory consumer:
@@ -41,35 +41,28 @@ On linux `ring_buffer__poll()` and `ring_buffer__consume()` are used to invoke t
4141
`poll()` waits for available data (or until timeout), then consumes all available records.
4242
`consume()` consumes all available records (without waiting).
4343

44-
Windows will initially only support `ring_buffer__poll()`, which can be called with a timeout of zero
45-
to get the same behaviour as `ring_buffer__consume()`.
44+
Windows now supports both `ring_buffer__poll()` and `ring_buffer__consume()`, with Linux-compatible behavior.
45+
`ring_buffer__consume()` is equivalent to calling `ring_buffer__poll()` with a timeout of zero.
4646

4747
#### Asynchronous callbacks
4848

49-
On Linux ring buffers currently support only synchronous callbacks (using poll/consume).
50-
In contrast, Windows eBPF currently supports only asynchronous ring buffer callbacks,
51-
where the callback is automatically invoked when data is available.
49+
On Linux ring buffers support only synchronous callbacks (using poll/consume).
50+
Windows eBPF now supports both synchronous callbacks (default, matching Linux) and asynchronous ring buffer callbacks.
5251

53-
This proposal adds support for synchronous consumers by setting the `RINGBUF_FLAG_NO_AUTO_CALLBACK` flag.
54-
With the flag set, callbacks will not automatically be called.
55-
To invoke the callback and `ring_buffer__poll()`
56-
should be called to poll for available data and invoke the callback.
57-
On Windows a timeout of zero can be passed to `ring_buffer__poll()` to get the same behaviour as `ring_buffer__consume()` (consume available records without waiting).
58-
59-
When #4142 is resolved the default behaviour will be changed from asynchronous (automatic) to synchronous callbacks,
60-
so `RINGBUF_FLAG_AUTO_CALLBACK` should always be specified for asynchronous callbacks for forward-compatibility.
52+
For synchronous callbacks (Linux-compatible), use the default behavior with `ring_buffer__new()`.
53+
For asynchronous callbacks (Windows-specific), use `ebpf_ring_buffer__new()` with the `EBPF_RINGBUF_FLAG_AUTO_CALLBACK` flag.
6154

6255
#### Memory mapped consumers
6356

6457
As an alternative to callbacks, Linux ring buffer consumers can directly access the
6558
ring buffer data by calling `mmap()` on a ring_buffer map fd to map the data into user space.
6659
`ring_buffer__epoll_fd()` is used on Linux to get an fd to use with epoll to wait for data.
6760

68-
Windows doesn't have directly compatible APIs to Linux mmap and epoll, so instead we will perfom the mapping
61+
Windows doesn't have directly compatible APIs to Linux mmap and epoll, so instead we perform the mapping
6962
in the eBPF core and use a KEVENT to signal for new data.
7063

71-
For direct memory mapped consumers on Windows, use `ebpf_ring_buffer_get_buffer` to get pointers to the producer and consumer
72-
pages mapped into user space, and `ebpf_ring_buffer_get_wait_handle()` to get the SynchronizationEvent (auto-reset) KEVENT
64+
For direct memory mapped consumers on Windows, use `ebpf_ring_buffer_map_map_buffer` to get pointers to the producer and consumer
65+
pages mapped into user space, and `ebpf_map_set_wait_handle()` to set a HANDLE
7366
to use with `WaitForSingleObject`/`WaitForMultipleObjects`.
7467

7568
Similar to the linux memory layout, the first pages of the shared ring buffer memory are the "producer page" and "consumer page",
@@ -97,53 +90,95 @@ ebpf_result_t
9790
ebpf_ring_buffer_output(_Inout_ ebpf_ring_buffer_t* ring, _In_reads_bytes_(length) uint8_t* data, size_t length, size_t flags)
9891
```
9992
100-
_Note:_ The currently internal `ebpf_ring_buffer_record.h` with helpers for working with raw records will also be made public.
93+
**Note:** The currently internal `ebpf_ring_buffer_record.h` with helpers for working with raw records will also be made public.
10194
10295
#### Updated libbpf API for callback consumer
10396
104-
The default behaviour of these functions will be unchanged for now.
97+
The default behaviour of these functions has been updated to use synchronous callbacks to match Linux libbpf behavior.
98+
99+
Use `ebpf_ring_buffer__new()` with `EBPF_RINGBUF_FLAG_AUTO_CALLBACK` to set up automatic callbacks for each record.
100+
Use `ring_buffer__new()` (default behavior) or `ebpf_ring_buffer__new()` with `EBPF_RINGBUF_FLAG_NO_AUTO_CALLBACK` to set up synchronous callbacks that are invoked via `ring_buffer__poll()` or `ring_buffer__consume()`.
105101
106-
Use the existing `ring_buffer__new()` to set up automatic callbacks for each record.
107-
Call `ebpf_ring_buffer_get_buffer()` ([New eBPF APIs](#new-ebpf-apis-for-mapped-memory-consumer))
102+
Call `ebpf_ring_buffer_map_map_buffer()` ([New eBPF APIs](#new-ebpf-apis-for-mapped-memory-consumer))
108103
to get direct access to the mapped ring buffer memory.
109104
105+
For Windows-specific functionality, use the `ebpf_ring_buffer__*` variants which accept `ebpf_ring_buffer_opts` with flags.
106+
110107
```c
111108
struct ring_buffer;
112109
113110
typedef int (*ring_buffer_sample_fn)(_Inout_ void *ctx, _In_reads_bytes_(size) void *data, size_t size);
114111
115112
struct ring_buffer_opts {
116113
size_t sz; /* size of this struct, for forward/backward compatibility */
114+
};
115+
116+
/* Windows-specific extended options */
117+
struct ebpf_ring_buffer_opts {
118+
size_t sz; /* size of this struct, for forward/backward compatibility */
117119
uint64_t flags; /* ring buffer option flags */
118120
};
119121
120-
// Ring buffer manager options.
121-
// - The default behaviour is currently automatic callbacks, but may change in the future per #4142.
122+
// Ring buffer manager options (Windows-specific).
123+
// - The default behaviour is now synchronous callbacks to match Linux libbpf.
122124
// - Only specify one of AUTO_CALLBACK or NO_AUTO_CALLBACK - specifying both is not allowed.
123-
enum ring_buffer_flags {
124-
RINGBUF_FLAG_AUTO_CALLBACK = (uint64_t)1 << 0 /* Automatically invoke callback for each record */
125-
RINGBUF_FLAG_NO_AUTO_CALLBACK = (uint64_t)2 << 0 /* Don't automatically invoke callback for each record */
125+
enum ebpf_ring_buffer_flags {
126+
EBPF_RINGBUF_FLAG_AUTO_CALLBACK = (uint64_t)1 << 0, /* Automatically invoke callback for each record */
127+
EBPF_RINGBUF_FLAG_NO_AUTO_CALLBACK = (uint64_t)1 << 1, /* Don't automatically invoke callback for each record */
126128
};
127129
128130
#define ring_buffer_opts__last_field sz
131+
#define ebpf_ring_buffer_opts__last_field flags
129132
130133
/**
131-
* @brief Creates a new ring buffer manager.
134+
* @brief Creates a new ring buffer manager (Linux-compatible).
132135
*
136+
* Uses synchronous callbacks by default (matching Linux libbpf behavior).
133137
* Only one consumer can be attached at a time, so it should not be called multiple times on an fd.
134138
*
135139
* If the return value is NULL the error will be returned in errno.
136140
*
137141
* @param[in] map_fd File descriptor to ring buffer map.
138142
* @param[in] sample_cb Pointer to ring buffer notification callback function (if used).
139143
* @param[in] ctx Pointer to sample_cb callback function context.
140-
* @param[in] opts Ring buffer options.
144+
* @param[in] opts Ring buffer options (currently unused, should be NULL).
141145
*
142146
* @returns Pointer to ring buffer manager.
143147
*/
144148
struct ring_buffer *
145149
ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, _Inout_ void *ctx,
146-
_In_ const struct ring_buffer_opts *opts);
150+
_In_opt_ const struct ring_buffer_opts *opts);
151+
152+
/**
153+
* @brief Creates a new ring buffer manager (Windows-specific with flags).
154+
*
155+
* Only one consumer can be attached at a time, so it should not be called multiple times on an fd.
156+
*
157+
* If the return value is NULL the error will be returned in errno.
158+
*
159+
* @param[in] map_fd File descriptor to ring buffer map.
160+
* @param[in] sample_cb Pointer to ring buffer notification callback function (if used).
161+
* @param[in] ctx Pointer to sample_cb callback function context.
162+
* @param[in] opts Ring buffer options with Windows-specific flags.
163+
*
164+
* @returns Pointer to ring buffer manager.
165+
*/
166+
struct ring_buffer *
167+
ebpf_ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, _Inout_ void *ctx,
168+
_In_opt_ const struct ebpf_ring_buffer_opts *opts);
169+
170+
/**
171+
* @brief Add another ring buffer map to the ring buffer manager.
172+
*
173+
* @param[in] rb Ring buffer manager.
174+
* @param[in] map_fd File descriptor to ring buffer map.
175+
* @param[in] sample_cb Pointer to ring buffer notification callback function.
176+
* @param[in] ctx Pointer to sample_cb callback function context.
177+
*
178+
* @retval 0 Success.
179+
* @retval <0 Error.
180+
*/
181+
int ring_buffer__add(struct ring_buffer *rb, int map_fd, ring_buffer_sample_fn sample_cb, void *ctx);
147182
148183
/**
149184
* @brief poll ringbuf for new data
@@ -152,15 +187,26 @@ ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, _Inout_ void *ctx,
152187
* If timeout_ms is zero, poll will not wait but only invoke the callback on records that are ready.
153188
* If timeout_ms is -1, poll will wait until data is ready (no timeout).
154189
*
155-
* This function is only supported when automatic callbacks are disabled (see RINGBUF_FLAG_NO_AUTO_CALLBACK).
190+
* This function is only supported when automatic callbacks are disabled.
156191
*
157192
* @param[in] rb Pointer to ring buffer manager.
158193
* @param[in] timeout_ms Maximum time to wait for (in milliseconds).
159194
*
160-
* @returns Number of records consumed, INT_MAX, or a negative number on error
195+
* @returns Number of records consumed, or a negative number on error
161196
*/
162197
int ring_buffer__poll(_In_ struct ring_buffer *rb, int timeout_ms);
163198
199+
/**
200+
* @brief consume available records without waiting
201+
*
202+
* Equivalent to ring_buffer__poll() with timeout_ms=0.
203+
*
204+
* @param[in] rb Pointer to ring buffer manager.
205+
*
206+
* @returns Number of records consumed, or a negative number on error
207+
*/
208+
int ring_buffer__consume(_In_ struct ring_buffer *rb);
209+
164210
/**
165211
* @brief Frees a ring buffer manager.
166212
*
@@ -219,6 +265,7 @@ typedef struct _ebpf_ring_buffer_producer_page
219265
* Multiple calls will return the same pointers, as the ring buffer manager only maps the ring once.
220266
*
221267
* @param[in] rb Pointer to ring buffer manager.
268+
* @param[in] index Index of the map in the ring buffer manager (0-based).
222269
* @param[out] producer_page Pointer to start of read-only mapped producer page.
223270
* @param[out] consumer_page Pointer to start of read-write mapped consumer page.
224271
* @param[out] data Pointer to start of read-only double-mapped data pages.
@@ -229,6 +276,7 @@ typedef struct _ebpf_ring_buffer_producer_page
229276
*/
230277
ebpf_result_t ebpf_ring_buffer_get_buffer(
231278
_In_ struct ring_buffer *rb,
279+
_In_ uint32_t index,
232280
_Out_ ebpf_ring_buffer_consumer_page_t **consumer_page,
233281
_Out_ const ebpf_ring_buffer_producer_page_t **producer_page,
234282
_Outptr_result_buffer_(*data_size) const uint8_t **data,
@@ -399,23 +447,20 @@ For(;;) {
399447
Exit:
400448
```
401449

402-
#### Polling ring buffer consumer (using ringbuf manager)
450+
#### Polling ring buffer consumer (using ringbuf manager, matches Linux code)
403451

404452
```c
405453
// sample callback
406454
int ring_buffer_sample_fn(void *ctx, void *data, size_t size) {
407455
// … business logic to handle record …
456+
return 0;
408457
}
409458

410459
// consumer code
411-
struct ring_buffer_opts opts;
412-
opts.sz = sizeof(opts);
413-
opts.flags = RINGBUF_FLAG_NO_AUTO_CALLBACK; //no automatic callbacks
414-
415460
fd_t map_fd = bpf_obj_get(rb_map_name.c_str());
416461
if (map_fd == ebpf_fd_invalid) return 1;
417462

418-
struct ring_buffer *rb = ring_buffer__new(map_fd, ring_buffer_sample_fn sample_cb, &opts);
463+
struct ring_buffer *rb = ring_buffer__new(map_fd, ring_buffer_sample_fn, nullptr, nullptr);
419464
if (rb == NULL) return 1;
420465

421466
// now loop as long as there isn't an error
@@ -426,10 +471,40 @@ while(ring_buffer__poll(rb, -1) >= 0) {
426471
ring_buffer__free(rb);
427472
```
428473
474+
#### Asynchronous ring buffer consumer (Windows-specific)
475+
476+
```c
477+
// sample callback - this will be called automatically for each record
478+
int ring_buffer_sample_fn(void *ctx, void *data, size_t size) {
479+
// … business logic to handle record …
480+
return 0;
481+
}
482+
483+
// consumer code
484+
fd_t map_fd = bpf_obj_get(rb_map_name.c_str());
485+
if (map_fd == ebpf_fd_invalid) return 1;
429486
430-
### Linux consumer examples (for comparison)
487+
// Set up Windows-specific ring buffer options for automatic callbacks
488+
struct ebpf_ring_buffer_opts opts = {};
489+
opts.sz = sizeof(opts);
490+
opts.flags = EBPF_RINGBUF_FLAG_AUTO_CALLBACK; // Enable automatic callbacks
431491
432-
#### Linux direct mmap consumer
492+
struct ring_buffer *rb = ebpf_ring_buffer__new(map_fd, ring_buffer_sample_fn, nullptr, &opts);
493+
if (rb == NULL) return 1;
494+
495+
// With automatic callbacks, the callback function is invoked immediately
496+
// when each record is written to the ring buffer. No polling is needed.
497+
// The ring buffer manager handles the processing automatically.
498+
499+
// Keep the application running while callbacks are processed
500+
// (actual application logic would determine when to exit)
501+
Sleep(60000); // Sleep for 60 seconds or until application should exit
502+
503+
ring_buffer__free(rb);
504+
```
505+
506+
507+
### Linux direct mmap consumer example (for comparison)
433508

434509
```c
435510
size_t page_size = 4096;
@@ -538,28 +613,6 @@ munmap(producer, mmap_sz);
538613
close(map_fd);
539614
```
540615
541-
#### Linux ring buffer manager consumer
542-
543-
```c
544-
// sample callback
545-
int ring_buffer_sample_fn(void *ctx, void *data, size_t size) {
546-
// … business logic to handle record …
547-
}
548-
549-
fd_t map_fd = bpf_obj_get(rb_map_name.c_str());
550-
if (map_fd == ebpf_fd_invalid) return 1;
551-
552-
struct ring_buffer *rb = ring_buffer__new(map_fd, ring_buffer_sample_fn sample_cb, NULL);
553-
if (rb == NULL) return 1;
554-
555-
// now loop as long as there isn't an error
556-
while(ring_buffer__poll(rb, -1) >= 0) {
557-
// data processed by event callback
558-
}
559-
560-
ring_buffer__free(rb);
561-
```
562-
563616
*Below implementation details of the internal ring buffer data structure are discussed.*
564617
565618
## Internal Ring Buffer

ebpfapi/Source.def

Lines changed: 5 additions & 0 deletions
@@ -142,6 +142,8 @@ EXPORTS
142142
ebpf_program_query_info
143143
ebpf_program_synchronize
144144
ebpf_ring_buffer__new
145+
ebpf_ring_buffer_get_buffer
146+
ebpf_ring_buffer_get_wait_handle
145147
ebpf_ring_buffer_map_map_buffer
146148
ebpf_ring_buffer_map_unmap_buffer
147149
ebpf_ring_buffer_map_write
@@ -160,5 +162,8 @@ EXPORTS
160162
libbpf_strerror
161163
perf_buffer__free
162164
perf_buffer__new
165+
ring_buffer__add
166+
ring_buffer__consume
163167
ring_buffer__free
164168
ring_buffer__new
169+
ring_buffer__poll
