Re: [PATCH v2 1/2] rust: poll: make PollCondVar upgradable
From: Boqun Feng
Date: Wed Mar 04 2026 - 18:37:40 EST
On Wed, Mar 04, 2026 at 09:37:45PM +0000, Alice Ryhl wrote:
> On Wed, Mar 04, 2026 at 08:29:12AM -0800, Boqun Feng wrote:
> > On Wed, Mar 04, 2026 at 07:59:59AM +0000, Alice Ryhl wrote:
> > [...]
> > > > > + // If a normal waiter registers in parallel with us, then either:
> > > > > + // * We took the lock first. In that case, the waiter sees the above cmpxchg.
> > > > > + // * They took the lock first. In that case, we wake them up below.
> > > > > + drop(lock.lock());
> > > > > + self.simple.notify_all();
> > > >
> > > > Hmm.. what if the waiter gets its `&CondVar` before `upgrade()` and uses
> > > > that directly?
> > > >
> > > > <waiter>                                  <in upgrade()>
> > > > let poll_cv: &UpgradePollCondVar = ...;
> > > > let cv = poll_cv.deref();
> > > >                                           cmpxchg();
> > > >                                           drop(lock.lock());
> > > >                                           self.simple.notify_all();
> > > > let mut guard = lock.lock();
> > > > cv.wait(&mut guard);
> > > >
> > > > we still miss the wake-up, right?
> > > >
> > > > It's creative, but I particularly hate that we use an empty lock
> > > > critical section to synchronize ;-)
> > >
> > > I guess instead of exposing Deref, I can just implement `wait` directly
> > > on `UpgradePollCondVar`. Then this API misuse is not possible.
> > >
> >
> > If we do that, then we can avoid the `drop(lock.lock())` as well, if
> > we do:
> >
> > impl UpgradePollCondVar {
> >     pub fn wait(...) {
> >         prepare_to_wait_exclusive(); // <- this will take the lock in
> >                                      // simple.wait_queue_head. So
> >                                      // either upgrade() comes first,
> >                                      // or it observes the wait being
> >                                      // queued.
> >         let cv_ptr = self.active.load(Relaxed);
> >         if !ptr_eq(cv_ptr, &self.simple) { // We have moved away from
> >                                            // simple, so we need to
> >                                            // wake up and redo the
> >                                            // wait.
> >             finish_wait();
> >         } else {
> >             guard.do_unlock(|| { schedule_timeout(); });
> >             finish_wait();
> >         }
> >     }
> > }
> >
> > (CondVar::notify*() will take the wait_queue_head lock as well)
>
> Yeah but then I have to actually re-implement those methods and not just
> call them. Seems not worth it.
>
We can pass a closure to wait_*() as a condition:
    fn wait_internal<T: ?Sized, B: Backend, F: FnOnce() -> bool>(
        &self,
        wait_state: c_int,
        guard: &mut Guard<'_, T, B>,
        cond: Option<F>,
        timeout_in_jiffies: c_long,
    ) -> c_long {
I'm not just suggesting this because it helps in this case. In a more
general pattern (if you see ___wait_event() macro in
include/linux/wait.h), the condition checking after prepare_to_wait*()
is needed to prevent wake-up misses. So maybe in the long term, we will
have cases where we need to check the condition for `CondVar` as well.
Plus, you don't need to pass a &Lock to poll() if you do this ;-)
> > > > Do you think the complexity of dynamic upgrading is worthwhile, or
> > > > should we just use the box-allocated PollCondVar unconditionally?
> > > >
> > > > I think if the current users won't benefit from the dynamic upgrading
> > > > then we can avoid the complexity. We can always add it back later.
> > > > Thoughts?
> > >
> > > I do actually think it's worthwhile to consider:
> > >
> > > I started an Android device running this. It created 3961 instances of
> > > `UpgradePollCondVar` during the hour it ran, but only 5 were upgraded.
> > >
> >
> > That makes sense, thank you for providing the data! But I still think
> > we should look more closely at the actual difference between dynamic
> > upgrading vs. an unconditionally box-allocated PollCondVar, because I
> > would assume that when a `UpgradePollCondVar` is created, other
> > allocations happen as well (e.g. when creating an Arc<binder::Thread>),
> > so the extra cost of the allocation may be unnoticeable.
>
> Perf-wise it doesn't matter, but I'm concerned about memory usage.
>
Let's see, we are comparing the memory cost between (assuming a 64-bit
system and LOCKDEP=n):
    struct UpgradePollCondVar {
        simple: CondVar,                                // <- 24 bytes (1 spinlock + 2 pointers)
        active: Atomic<*const UpgradePollCondVarInner>, // <- 8 bytes, but +40 extra bytes
                                                        //    on the heap in the worst case.
    }

vs

    struct BoxedPollCondVar {
        active: Box<UpgradePollCondVarInner>, // <- 8 bytes, but +40 extra bytes on the heap
    }
that's an extra 16 bytes per binder::Thread, but binder::Thread itself
is more than 100 bytes. Of course it's up to binder whether 16 bytes per
thread is a lot or not, but to me, I would choose the simplicity ;-)
Regards,
Boqun
> Alice