Re: [PATCH net-next v2 5/6] rust: Add read_poll_timeout function

From: Andrew Lunn
Date: Tue Oct 08 2024 - 18:26:33 EST


On Tue, Oct 08, 2024 at 02:53:56PM -0700, Boqun Feng wrote:
> On Tue, Oct 08, 2024 at 07:16:42PM +0200, Andrew Lunn wrote:
> > On Tue, Oct 08, 2024 at 03:14:05PM +0200, Miguel Ojeda wrote:
> > > On Tue, Oct 8, 2024 at 2:13 PM Andrew Lunn <andrew@xxxxxxx> wrote:
> > > >
> > > > As far as i see, might_sleep() will cause UAF where there is going to
> > > > be a UAF anyway. If you are using it correctly, it does not cause UAF.
> > >
> > > This already implies that it is an unsafe function (in general, i.e.
> > > modulo klint, or a way to force the user to have to write `unsafe`
> > > somewhere else, or what I call ASHes -- "acknowledged soundness
> > > holes").
> > >
> > > If we consider as safe functions that, if used correctly, do not cause
> > > UB, then all functions would be safe.
> >
> > From what i hear, klint is still WIP. So we have to accept there will
> > be bad code out there, which will UAF. We want to find such bad code,
>
> If you don't believe in klint

I did not say that. It is WIP, and without it i assume nothing is
detecting at compile time that the code is broken. Hence we need to
find the problem at runtime, which is what might_sleep() is all about.

> might_sleep() is useful because it checks preemption count and task
> state, which is provided by __might_sleep() as well. I don't think
> causing UAF helps we detect atomic context violation faster than what
> __might_sleep() already have. Again, could you provide an example that
> help me understand your reasoning here?

> > while (1) {
> > <reader> <updater>
> > rcu_read_lock();
> > p = rcu_dereference(gp);
> > mutex_lock(&lock)
> > a = READ_ONCE(p->a);
> > mutex_unlock(&lock)
> > rcu_read_unlock();
> > }

The mutex lock is likely to be uncontested, so you don't sleep, and so
don't trigger the UAF. The code is clearly broken, but you survive.
Until the lock is contested, you do sleep, RCU falls apart, resulting
in a UAF.

Now if you used might_sleep(), every time you go around that loop you
do some of the same processing as actually sleeping, so are much more
likely to trigger the UAF.

might_sleep() as you pointed out, is also active when
CONFIG_DEBUG_ATOMIC_SLEEP is false. Thus it is also going to trigger
the broken code to UAF faster. And i expect a lot of testing is done
without CONFIG_DEBUG_ATOMIC_SLEEP and CONFIG_PROVE_LOCKING.

Once klint is completed, and detects all these problems at compile
time, we can then discard all this might_sleep stuff. But until then,
the faster code explodes, the more likely it is going to be quickly
and cheaply fixed.

Andrew