Re: [PATCH bpf v2 8/8] libbpf: ringbuf: Prevent missed wakeups

From: Eduard Zingerman

Date: Fri Jun 19 2026 - 21:54:13 EST


On Thu, 2026-06-18 at 20:26 -0400, Tamir Duberstein wrote:
> After consuming the last visible record, ringbuf_process_ring()
> publishes the consumer position and checks the producer position. These
> operations lack a full StoreLoad barrier. A producer can therefore
> commit a new record but read the old consumer position while the
> consumer reads the old producer position. The producer sends no
> notification and the consumer waits despite a queued record.
>
> Insert a full barrier between publishing a consumer position and the
> next producer position load. When a record bound or callback ends the
> current invocation first, execute the barrier before returning so the
> load in a later invocation completes the same handshake.
>
> Add an edge-triggered epoll test that drains one record per call while a
> concurrent producer fills the ring. Without the barrier, a missed
> notification leaves the producer dropping records from a full ring while
> the consumer times out. Document that bounded consumers and callbacks
> that terminate consumption must drain before waiting again.
>
> Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> Reported-by: Andrew Werner <awerner32@xxxxxxxxx>
> Reported-by: Sashiko <sashiko-bot@xxxxxxxxxx>
> Closes: https://lore.kernel.org/bpf/20260614015716.945AF1F000E9@xxxxxxxxxxxxxxx/
> Assisted-by: Codex:gpt-5.5
> Signed-off-by: Tamir Duberstein <tamird@xxxxxxxxxx>
> ---

Took me a while, but I agree there is a race here.
FWIW, here is the description of the race as far as I understand it.

Simplified pseudo-code for the consumer (ringbuf_process_ring)

cons_pos = load_acquire(consumer_pos); // op#1 [orders before 2,3,4,5]
do {
got_new_data = false;
prod_pos = load_acquire(producer_pos); // op#2 [orders before 3,4,5]
while (cons_pos < prod_pos) {
len_ptr = ... cons_pos ...;
len = load_acquire(len_ptr); // op#3 [orders before 4,5]
got_new_data = true;
callback(... cons_pos ...); // op#4
cons_pos += len;
store_release(consumer_pos, cons_pos); // op#5 [orders after 2,3,4; ordering relative to 2' not defined]
}
} while (got_new_data);

The patch fixes the issue with op#5, store_release operation is
ordered after operations 2,3,4, but it's ordering relative to op#2
from the next outer loop iteration (denoted as 2') is not defined.

Simplified pseudo-code for the producer (bpf_ringbuf_{reserve,commit})

// reserve
store_release(producer_pos, ...);

// commit
xchg(... clear BUSY_BIT ...);
cons_pos = load_acquire(consumer_pos);
if (cons_pos == rec_pos)
schedule_wakeup();

A possible sequence of events

Producer:

// submits a record such that:
consumer_pos = 0 (initial state)
producer_pos = 10

Consumer:

cons_pos = load_acquire(consumer_pos) -> cons_pos = 0
prod_pos = load_acquire(producer_pos) -> prod_pos = 10
len = load_acquire(len_ptr) -> len = 10
callback(...) -> (record at pos 0 consumed)
cons_pos += len -> cons_pos = 10
store_release(consumer_pos, cons_pos) -> consumer_pos = 10 (issued; global visibility deferred)
(cons_pos < prod_pos) 10 < 10 -> false, exit inner loop
while (got_new_data) -> true, re-enter outer loop
prod_pos = load_acquire(producer_pos) -> prod_pos = 10 (reads current value; record at pos 10 not published yet)
(cons_pos < prod_pos) 10 < 10 -> false, skip inner loop
while (got_new_data) -> false; consumer returns 0 and enters an epoll wait

Producer (reserves and submits a new record):

store_release(producer_pos, 20) -> producer_pos = 20 (record at pos 10 reserved)
xchg(... clear BUSY_BIT ...) -> (record at pos 10 committed)
cons_pos = load_acquire(consumer_pos) -> cons_pos = 0 (consumer's store of 10 not visible yet)
rec_pos = 10; cons_pos == rec_pos -> 0 == 10 false -> NO WAKEUP

Consumer:

(deferred store becomes visible) -> consumer_pos = 10 (too late)

---

Consumer exits while a new record is available in the buffer,
and the producer fails to notify the consumer via epoll.
The added barrier prevents such sequence of events.

For the fix:

Reviewed-by: Eduard Zingerman <eddyz87@xxxxxxxxx>

Please move the test to a separate commit, I'll review it a bit later.

[...]