Re: [PATCH v2 16/19] gendwarfksyms: Add support for reserved structure fields

From: Benno Lossin
Date: Fri Sep 13 2024 - 04:01:13 EST


On 13.09.24 00:37, Sami Tolvanen wrote:
> Hi,
>
> On Thu, Sep 12, 2024 at 2:58 PM Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>>
>> On 12.09.24 22:58, Sami Tolvanen wrote:
>>> That's an interesting point. Is the problem that you cannot assign
>>> arbitrary values to the Rust enum that bindgen generates, or is using
>>> a #define the problem? We could probably just make the hidden enum
>>> values visible to bindgen only if needed.
>>
>> So if I take your example from above add it to our bindgen input, then I
>> get the following output:
>>
>> pub const e_A: my_own_test_enum = 0;
>> pub const e_B: my_own_test_enum = 1;
>> pub type e_enum = core::ffi::c_uint;
>>
>> So it doesn't pick up the other constants at all. That is probably
>> because we haven't enabled the bindgen flag that adds support for
>> function-like macros. If I enable that flag (`--clang-macro-fallback`,
>> then the output becomes:
>>
>> pub const C: u32 = 2;
>> pub const D: u32 = 3;
>> pub const e_A: e = 0;
>> pub const e_B: e = 1;
>> pub type e = ::std::os::raw::c_uint;
>>
>> So it doesn't really work as we would like it to (ie missing e_ prefix).
>
> If defines are a problem, we can always use a const int instead. It
> doesn't have to be defined inside the enum either, and probably we can
> add a prefix too.

They might also be a problem, though I haven't checked. It would be best
if they can just stay in the `enum`.

>> But even if bindgen were to start supporting `#define` inside of the
>> enum. It might still have a problem with the `#define`: there is the
>> `--rustified-enum <REGEX>` option for bindgen that would change the
>> output to this:
>>
>> #[repr(u32)]
>> #[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
>> pub enum e {
>> A = 0,
>> B = 1,
>> }
>>
>> Which makes using the values on the Rust side a lot easier, since you
>> get exhaustiveness checks when using `match`. Adding the
>> `--clang-macro-fallback` flag, I get:
>>
>> pub const C: u32 = 2;
>> pub const D: u32 = 3;
>> #[repr(u32)]
>> #[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
>> pub enum e {
>> A = 0,
>> B = 1,
>> }
>>
>> Which is a big problem, because the enum `e` won't have 2 or 3 as valid
>> values (it will be UB to write them to a variable of type `e`).
>
> Yes, I sort of thought that this might be an issue. I don't see this
> in bindgen flags right now, are you planning on switching the kernel
> bindgen to use --rustified-enum?

You mean you don't see the `--clang-macro-fallback` option? I think it
was added in version 0.70.0.

> If you do plan to use --rustified-enum, we could just use #ifdef
> __BINDGEN__ to hide the fields from everyone else, but I think we
> might actually need a more generic solution after all. I'll think
> about it a bit more.

Well we don't exactly plan to use `--rustified-enum`, the problem is
that transmuting the integer that C gives us to that enum is UB, when
the integer is not a valid bit pattern for that enum. Instead we would
like to have an option to generate both the Rust-style enum and a
newtype enum that can hold any integer value. We then check at runtime
that the value is in range and error otherwise. This is being worked on
at [1].
I would say that it has the same issue that `--rustified-enum` currently
has.

[1]: https://github.com/rust-lang/rust-bindgen/pull/2908

---
Cheers,
Benno