Re: Rust kernel policy
From: James Bottomley
Date: Thu Feb 20 2025 - 11:14:38 EST
On Wed, 2025-02-19 at 17:44 +0100, Miguel Ojeda wrote:
> On Wed, Feb 19, 2025 at 5:03 PM James Bottomley
> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
[...]
> > This very much depends on how the callers are coded, I think. When
> > I looked at Wedson's ideas on this, the C API contracts were
> > encoded in the headers, so mostly only the headers not the body of
> > the code had to change (so the headers needed updating when the C
> > API contract changed). If the enhanced bindgen produces new headers
> > then code like this will just update without breaking (I admit not
> > all code will work like that, but it's still a useful property).
>
> Hmm... I am not sure exactly what you mean here. Are you referring to
> Wedson's FS slides from LSF/MM/BPF? i.e., are you referring to Rust
> signatures?
OK, this is just a terminology difference. I think of bindings as the
glue that sits between two pieces of code trying to interact. In your
terms that's both the abstractions and the bindgen bindings.
> If yes, those signatures are manually written, they are not the
> generated bindings. We typically refer to those as "abstractions", to
> differentiate from the generated stuff.
I understand, but it's the manual generation of the abstractions that's
causing the huge pain when the C API changes because they have to be
updated manually by someone.
> The Rust callers (i.e. the users of those abstractions) definitely do
> not need to change if the C APIs change (unless they change in a
> major way that you need to redesign your Rust abstractions layer, of
> course).
>
> So, for instance, if your C API gains a parameter, then you should
> update all your C callers as usual, plus the Rust abstraction that
> calls C (which could be just a single call). But you don't need to
> update all the Rust modules that call Rust abstractions.
You say that like it's easy ... I think most people who work in the
kernel wouldn't know how to do this.
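As a rough illustration of the change described in the quoted text (a C
API gaining a parameter), here is a minimal sketch. Everything in it is
hypothetical: widget_resize, its flags parameter, the Widget type and
driver_probe are invented for the example, and the "binding" is a plain
unsafe Rust function standing in for what bindgen would generate from a
C header, so that the sketch compiles on its own. It is not taken from
the kernel tree.

```rust
// Hypothetical stand-in for a bindgen-generated binding. In a real tree this
// would be an `extern "C"` declaration produced from the C header; here it is
// an ordinary unsafe Rust function so the example is self-contained.
mod bindings {
    /// Pretend C API: int widget_resize(u32 *len, u32 new_len, u32 flags).
    /// Assume `flags` is the parameter that was newly added on the C side.
    pub unsafe fn widget_resize(len: *mut u32, new_len: u32, flags: u32) -> i32 {
        let _ = flags; // the sketch does not interpret the flags
        if len.is_null() || new_len == 0 {
            return -22; // -EINVAL
        }
        unsafe { *len = new_len };
        0
    }
}

/// Error type carrying the negative errno returned by the binding.
#[derive(Debug, PartialEq)]
pub struct Error(pub i32);

/// Hand-written abstraction: the only Rust code that touches `bindings`.
pub struct Widget {
    len: u32,
}

impl Widget {
    pub fn new() -> Self {
        Widget { len: 0 }
    }

    pub fn len(&self) -> u32 {
        self.len
    }

    /// Safe API seen by Rust modules. Its signature did not change when the
    /// C function gained `flags`; only the single call below was updated.
    pub fn resize(&mut self, new_len: u32) -> Result<(), Error> {
        const DEFAULT_FLAGS: u32 = 0; // assumed default for the new parameter
        // SAFETY: `&mut self.len` is a valid, exclusive pointer for the
        // duration of the call.
        let ret = unsafe { bindings::widget_resize(&mut self.len, new_len, DEFAULT_FLAGS) };
        if ret == 0 {
            Ok(())
        } else {
            Err(Error(ret))
        }
    }
}

/// A "Rust module" caller: it uses only the safe abstraction, so it needs no
/// edit when the C signature changes.
pub fn driver_probe() -> Result<u32, Error> {
    let mut w = Widget::new();
    w.resize(16)?;
    Ok(w.len())
}
```

In this sketch the new parameter is absorbed inside Widget::resize, which
is the "single call" case; a change that alters the semantics of the API
would still require rethinking the abstraction itself.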
> In other words, we do not call C directly from Rust modules, in fact,
> we forbid it (modulo exceptional/justified cases). There is a bit
> more on that here, with a diagram:
>
> https://docs.kernel.org/rust/general-information.html#abstractions-vs-bindings
>
> In summary, those abstractions give you several things: the ability
> to provide safe APIs for Rust modules (instead of unsafe calls
> everywhere), the ability to write idiomatic Rust in your callers
> (instead of FFI) and the ability to reduce breaks like I think you
> are suggesting.
>
> Now, generating those safe abstractions automatically would be quite
> an achievement, and it would require more than just a few simple
> annotations in the header. Typically, it requires understanding the C
> implementation, and even then, it is hard for a human to do, i.e. we
> are talking about an open problem.
I'm under no illusion that this would be easy, but if there were a way
of having all the information required in the C code in such a way that
something like an extended sparse could check it (so if you got the
annotations wrong you'd notice) and an extended bindgen could generate
both the bindings and the abstractions from it, it would dramatically
reduce the friction the abstractions cause in kernel API updates.
> Perhaps you could approximate it with an AI that you give the C
> implementation, plus the C headers, plus the C headers and
> implementations that those call, and so on, up to some layer. Even
> then, it is a problem that typically has many different valid
> solutions, i.e. you can design your safe Rust API in different ways
> and with different tradeoffs.
>
> I hope that clarifies.
Yes, I think it does, thanks.
Regards,
James