Re: Rust kernel policy
From: comex
Date: Sun Feb 23 2025 - 18:32:18 EST
> On Feb 22, 2025, at 3:42 PM, Piotr Masłowski <piotr@xxxxxxxxxxxxx> wrote:
>
> I'm sure you already know this, but the idea of safety in Rust isn't
> just about making elementary language constructs safe. Rather, it is
> primarily about designing types and code in such a way one can't "use
> them wrong".
And importantly, it’s very hard to replicate this approach in C, even in a hypothetical ‘C + borrow checker’, because C has no generic types. Not all abstractions need generics, but many do.
Rust has Option<T>. C has null, and you manually track which pointers can be null.
Rust has Result<T, E>. Kernel C has ERR_PTR, and you manually track which pointers can be errors.
Rust has Arc<T> and Box<T> and &T and &mut T to represent different kinds of ownership. C has two pointer types, T * and const T *, and you manually track ownership.
Rust has Vec<T> and &[T] to represent arrays with dynamic length. C has pointers, and you manually keep the pointer and length together.
Rust has Mutex<T> (a mutex along with a mutex-protected value of type T), and MutexGuard<T> (an object representing the fact that a mutex is currently locked). C has plain mutexes, and you manually track which mutexes protect what data, along with which mutexes are currently locked.
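To make the comparisons above concrete, here is a minimal sketch using only the standard library rather than the kernel crate. The names Counter, first_even and parse_port are made up for illustration. The point is just that the "can be absent", "can be an error" and "is protected by this lock" facts live in the types instead of in the programmer's head.

```rust
use std::sync::Mutex;

/// Hypothetical counter whose value can only be reached through its lock.
pub struct Counter {
    hits: Mutex<u64>,
}

impl Counter {
    pub fn new() -> Self {
        Counter { hits: Mutex::new(0) }
    }

    /// Locks the mutex, increments the protected value, and returns it.
    pub fn record(&self) -> u64 {
        // `lock()` hands back a MutexGuard<u64>; the guard is the only
        // path to the data, so "forgot to take the lock" cannot compile.
        let mut guard = self.hits.lock().unwrap();
        *guard += 1;
        *guard
        // The guard is dropped at the end of the function, which unlocks.
    }
}

/// Option<T>: "there may be no value" is part of the return type.
pub fn first_even(values: &[u32]) -> Option<u32> {
    // `values` is a &[u32], so the pointer and length travel together.
    values.iter().copied().find(|v| v % 2 == 0)
}

/// Result<T, E>: "this may fail" is part of the return type.
pub fn parse_port(text: &str) -> Result<u16, std::num::ParseIntError> {
    text.trim().parse::<u16>()
}
```

A caller cannot use the result of first_even or parse_port without handling the None or Err case in some way, and cannot read Counter's value without going through record or the lock.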
Each of these abstractions is simple enough that it *could* be bolted onto C as its own special case. Clang has tried for many. In place of Option<T>, Clang added _Nullable and _Nonnull annotations to pointer types. In place of Arc<T>/Box<T>, Clang added ownership attributes [1]. In place of &[T], Clang added __counted_by / bounds-safety mode [2]. In place of Mutex<T>, Clang added a whole host of mutex-tracking attributes [3].
But needing a separate (and nonstandard) compiler feature for every abstraction you want to make really cuts down on flexibility. Compare Rust for Linux, which not only uses all of that basic vocabulary (with the ability to make Linux-specific customizations as needed), but also defines dozens of custom generic types [4] as safe wrappers around specific Linux APIs, forming abstractions that are too codebase-specific to bake into a compiler at all.
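As a rough illustration of what a codebase-specific generic wrapper looks like, here is a sketch in plain std Rust. Nothing here is actual Rust for Linux API: Registry, Device, Registration and Uart are all hypothetical names. The idea is that a Registration<T> value existing means the device is registered, and dropping it unregisters, for any T that implements the trait.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a subsystem with manual register/unregister.
#[derive(Default)]
pub struct Registry {
    names: RefCell<Vec<String>>,
}

impl Registry {
    /// Names of everything currently registered.
    pub fn active(&self) -> Vec<String> {
        self.names.borrow().clone()
    }
}

/// Hypothetical trait for anything that can be registered.
pub trait Device {
    fn name(&self) -> String;
}

/// Generic wrapper: holding a `Registration<T>` means `T` is registered.
pub struct Registration<T: Device> {
    registry: Rc<Registry>,
    device: T,
}

impl<T: Device> Registration<T> {
    /// Registers the device and takes ownership of it.
    pub fn new(registry: Rc<Registry>, device: T) -> Self {
        registry.names.borrow_mut().push(device.name());
        Registration { registry, device }
    }

    pub fn device(&self) -> &T {
        &self.device
    }
}

impl<T: Device> Drop for Registration<T> {
    /// Unregisters automatically; there is no separate call to forget.
    fn drop(&mut self) {
        let name = self.device.name();
        self.registry.names.borrow_mut().retain(|n| *n != name);
    }
}

/// Hypothetical device type used to exercise the wrapper.
pub struct Uart(pub u8);

impl Device for Uart {
    fn name(&self) -> String {
        format!("uart{}", self.0)
    }
}
```

The wrapper is written once and works for every device type. Doing the same in C with attributes would need the compiler to know about this particular register/unregister pair.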
This creates an expressiveness gap between C and Rust that cannot be bridged by safety attributes. Less expressiveness means more need for runtime enforcement, which means more overhead. That is one of the fundamental problems that will face any attempt to implement ‘safe C’.
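A small sketch of the tradeoff between expressiveness and runtime checks, again using std only and with made-up function names. When the "count is not zero" fact is carried by the type, the check happens once where the value is created, and the function that divides needs no check of its own.

```rust
use std::num::NonZeroU32;

/// Without a richer type, the callee must check at run time on every call.
pub fn per_item_checked(total: u32, count: u32) -> Option<u32> {
    if count == 0 {
        return None;
    }
    Some(total / count)
}

/// With NonZeroU32, the zero case is ruled out by the type, so the
/// division cannot fail and no check is needed here.
pub fn per_item(total: u32, count: NonZeroU32) -> u32 {
    total / count
}
```

The check does not disappear entirely. It moves to NonZeroU32::new, which returns an Option and runs once, instead of being repeated inside every function that uses the value.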
(A good comparison is Clang’s upcoming bounds-safety feature. It’s the most impressive iteration of ‘safe C’ I’ve seen so far. But unlike Rust, it only protects against indexing out of bounds, not against use-after-frees or bad casts. A C extension protecting against those would have to be a lot more invasive. In particular, focusing on spatial safety dodges many of the cases where generics are most important in Rust. But even then, bounds-safety mode requires lots of annotations in order to bring overhead down to acceptable levels.)
[1] https://clang.llvm.org/docs/AttributeReference.html#ownership-holds-ownership-returns-ownership-takes-clang-static-analyzer
[2] https://clang.llvm.org/docs/BoundsSafety.html
[3] https://clang.llvm.org/docs/ThreadSafetyAnalysis.html
[4] https://github.com/search?q=repo%3Atorvalds%2Flinux+%2F%28%3F-i%29struct+%5B%5E+%5C%28%5D*%3C.*%5BA-Z%5D.*%3E%2F+language%3ARust&type=code (requires GitHub login, sorry)