Re: [PATCH v2 16/19] gendwarfksyms: Add support for reserved structure fields

From: Benno Lossin
Date: Thu Sep 12 2024 - 17:58:47 EST


On 12.09.24 22:58, Sami Tolvanen wrote:
> Hi Benno,
>
> On Thu, Sep 12, 2024 at 11:08 AM Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>>
>> On 12.09.24 18:06, Sami Tolvanen wrote:
>>>
>>> I thought about this a bit and I wonder if we need a separate
>>> mechanism for that, or is it sufficient to just #define any additional
>>> hidden values you want to add instead of including them in the enum?
>>>
>>> enum e {
>>> A,
>>> B,
>>> #define C (B + 1)
>>> #define D (C + 1)
>>> };
>>>
>>>
>>> Do you see any issues with this approach? I think Clang would complain
>>> about this with -Wassign-enum, but I'm not sure if we even enable that
>>> in the kernel, and as long as you don't overflow the underlying type,
>>> which is a requirement for not breaking the ABI anyway, it should be
>>> fine.
>>
>> Rust has problems with `#define`-style enums, because bindgen (the tool
>> that generates definitions for Rust to be able to call C code) isn't
>> able to convert them to Rust enums.
>>
>> So if you can come up with an approach that allows you to continue to
>> use C enums instead of `#define`, we would appreciate that, since it
>> would make our lives a lot easier.
>
> That's an interesting point. Is the problem that you cannot assign
> arbitrary values to the Rust enum that bindgen generates, or is using
> a #define the problem? We could probably just make the hidden enum
> values visible to bindgen only if needed.

So if I take your example from above add it to our bindgen input, then I
get the following output:

pub const e_A: my_own_test_enum = 0;
pub const e_B: my_own_test_enum = 1;
pub type e_enum = core::ffi::c_uint;

So it doesn't pick up the other constants at all. That is probably
because we haven't enabled the bindgen flag that adds support for
function-like macros. If I enable that flag (`--clang-macro-fallback`,
then the output becomes:

pub const C: u32 = 2;
pub const D: u32 = 3;
pub const e_A: e = 0;
pub const e_B: e = 1;
pub type e = ::std::os::raw::c_uint;

So it doesn't really work as we would like it to (ie missing e_ prefix).

But even if bindgen were to start supporting `#define` inside of the
enum. It might still have a problem with the `#define`: there is the
`--rustified-enum <REGEX>` option for bindgen that would change the
output to this:

#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum e {
A = 0,
B = 1,
}

Which makes using the values on the Rust side a lot easier, since you
get exhaustiveness checks when using `match`. Adding the
`--clang-macro-fallback` flag, I get:

pub const C: u32 = 2;
pub const D: u32 = 3;
#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum e {
A = 0,
B = 1,
}

Which is a big problem, because the enum `e` won't have 2 or 3 as valid
values (it will be UB to write them to a variable of type `e`).


Would you add conditions to the `#define`? For example checking for the
version of kABI? (or how would it work?)

Because we might want to have something similar on the Rust side then:

#[repr(u32)]
#[derive(Debug, Copy, Clone, Hash, PartialEq, Eq)]
pub enum e {
A = 0,
B = 1,
#[cfg(kabi >= "some-version")]
C = 2,
#[cfg(kabi >= "some-version")]
B = 3,
}

(still generated by bindgen though)

---
Cheers,
Benno