Re: [PATCH net-next v2 5/6] rust: Add read_poll_timeout function
From: Boqun Feng
Date: Mon Oct 07 2024 - 10:08:03 EST
On Mon, Oct 07, 2024 at 03:48:09PM +0200, Andrew Lunn wrote:
> On Mon, Oct 07, 2024 at 05:28:28AM -0700, Boqun Feng wrote:
> > On Sun, Oct 06, 2024 at 04:45:21PM +0200, Andrew Lunn wrote:
> > [...]
> > > > > > + if sleep {
> > > > > > +     // SAFETY: FFI call.
> > > > > > +     unsafe { bindings::might_sleep() }
> > > > > > + }
> > > > >
> > > > > What is actually unsafe about might_sleep()? It is a void foo(void)
> > > >
> > > > Every extern "C" function is by default unsafe, because C doesn't have
> > > > the concept of safe/unsafe. If you want to avoid unsafe, you could
> > > > introduce a Rust might_sleep() which calls into
> > > > `bindings::might_sleep()`:
> > > >
> > > > pub fn might_sleep() {
> > > >     // SAFETY: ??
> > > >     unsafe { bindings::might_sleep() }
> > > > }
> > > >
> > > > however, if you call might_sleep() in a preemption-disabled context
> > > > when CONFIG_DEBUG_ATOMIC_SLEEP=n and PREEMPT=VOLUNTARY, it could mean
> > > > an unexpected RCU quiescent state, which results in an early RCU grace
> > > > period, and that may mean a use-after-free. So it's not as safe as you
> > > > might expect.
> > >
> > > If you call might_sleep() in a preemption disabled context your code is
> > > already unsafe, since that is the whole point of it, to find bugs
> >
> > Well, in Rust, the rule is: any type-checked (compiled successfully)
> > code that only calls safe Rust functions cannot be unsafe. So the fact
> > that calling might_sleep() in a preemption disabled context is unsafe
> > means that something has to be unsafe.
> >
> > This eventually can turn into a "blaming game" in the design space: we
> > can either design the preemption disable function as unsafe or the
> > might_sleep() function as unsafe. But one of them has to be an unsafe
> > function, otherwise we are breaking the safe code guarantee.
>
> Just keep in mind, it could have been C which put you into atomic
> context before calling into Rust. An interrupt handler would be a good
> example, and i'm sure there are others.
>
That's why the klint approach is preferred right now. Without klint, and
if we don't want to mark might_sleep() as unsafe, we probably need to
mark the registration of an interrupt handler unsafe, and the safety
requirement would be "making sure the handler doesn't call schedule()".
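
As a rough illustration of that second option, here is a userspace
sketch (not kernel code). The names register_irq(), Registration and
IrqHandler are hypothetical and do not correspond to any existing
kernel crate API; the point is only where the safety requirement ends
up being written down:

```rust
/// Hypothetical handler signature: takes the IRQ number, returns whether
/// the interrupt was handled.
pub type IrqHandler = fn(u32) -> bool;

/// Hypothetical registration handle that remembers the handler.
pub struct Registration {
    irq: u32,
    handler: IrqHandler,
}

/// Registers `handler` for `irq` (modelled here as just storing it).
///
/// # Safety
///
/// `handler` must never sleep, i.e. it must not end up calling
/// schedule(), because it will be invoked in atomic context.
pub unsafe fn register_irq(irq: u32, handler: IrqHandler) -> Registration {
    Registration { irq, handler }
}

impl Registration {
    /// Simulates the interrupt firing by invoking the stored handler.
    pub fn fire(&self) -> bool {
        (self.handler)(self.irq)
    }
}

/// Example handler that does no sleeping work.
pub fn quick_handler(irq: u32) -> bool {
    irq == 7
}
```

With this shape, every driver registering a handler has to write an
unsafe block and justify that the handler never sleeps, which is the
cost of keeping might_sleep() safe without klint.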
> > However, this is actually a special case: currently we want to use klint
> > [1] to detect all context mis-matches at compile time. So the above rule
> > extends to the kernel: any type-checked *and klint-checked* code that only
> > calls safe Rust functions cannot be unsafe. I.e. we add additional
> > compile time checking for unsafe code. So if might_sleep() has the
> > proper klint annotation, and we actually enable klint for kernel code,
> > then we can make it safe (along with preemption disable functions being
> > safe).
> >
> > > where you use a sleeping function in atomic context. Depending on why
> > > you are in atomic context, it might appear to work, until it does not
> > > actually work, and bad things happen. So it is not might_sleep() which
> > > is unsafe, it is the Rust code calling it.
> >
> > The whole point of unsafe functions is that calling them may result in
> > unsafe code, so that's why all extern "C" functions are unsafe, and so is
> > might_sleep() (without klint in the picture).
>
> There is a psychological part to this. might_sleep() is a good debug
> tool, which costs very little in normal builds, but finds logic bugs
> when enabled in debug builds. What we don't want is Rust developers
> not scattering it through their code because it adds unsafe code, and
> the aim is not to have any unsafe code.
>
Sure, but my point is these need to be put together into a proper
design. For example, spin_lock() is currently exposed to Rust as a
safe API lock(), so the following code is unsafe:

	let g = lock1.lock(); // lock1 is a spinlock
	might_sleep();
	drop(g);

Without the klint rule, if we want to mark might_sleep() as safe, then
we need to mark lock() as unsafe; otherwise, it's an unsafe code block
constructed from purely safe functions. However, compared to
might_sleep(), I think we would like to keep lock() safe since it is
used more widely.

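To make the trade-off concrete, here is a minimal userspace model (not
kernel code). PREEMPT_COUNT, Guard, lock() and in_atomic() below are
hypothetical stand-ins, and the debug_assert!() only loosely mimics
what CONFIG_DEBUG_ATOMIC_SLEEP=y would catch. It keeps lock() safe and
puts the safety requirement on might_sleep():

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical userspace stand-in for the kernel's preempt count.
    static PREEMPT_COUNT: Cell<u32> = Cell::new(0);
}

/// Hypothetical stand-in for a spinlock guard: while it is alive, the
/// current thread is considered to be in atomic context.
pub struct Guard(());

/// Safe to call, like the lock() API discussed above.
pub fn lock() -> Guard {
    PREEMPT_COUNT.with(|c| c.set(c.get() + 1));
    Guard(())
}

impl Drop for Guard {
    fn drop(&mut self) {
        // Leaving the critical section re-enables "preemption".
        PREEMPT_COUNT.with(|c| c.set(c.get() - 1));
    }
}

/// Returns true while at least one Guard is alive on this thread.
pub fn in_atomic() -> bool {
    PREEMPT_COUNT.with(|c| c.get() > 0)
}

/// Model of might_sleep() as an unsafe function.
///
/// # Safety
///
/// The caller must not be in atomic context (no Guard may be alive).
pub unsafe fn might_sleep() {
    // Only checked in debug builds; release builds silently continue,
    // which is the case that makes a safe signature problematic.
    debug_assert!(!in_atomic(), "sleeping function called from atomic context");
}
```

In this model the problematic sequence above can only be written by
spelling out an unsafe block around might_sleep(), so the burden falls
on the less widely used function.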
Regards,
Boqun
> Andrew