Re: [PATCH] rust: add `module_params` macro

From: Benno Lossin
Date: Mon Jul 29 2024 - 13:16:59 EST


On 05.07.24 13:15, Andreas Hindborg wrote:
> From: Andreas Hindborg <a.hindborg@xxxxxxxxxxx>
>
> This patch includes changes required for Rust kernel modules to utilize
> module parameters. This code implements read only support for integer
> types without `sysfs` support.
>
> This code is a reduced and updated version of code by Adam available in the
> original `rust` branch [1].
>
> [1] https://github.com/Rust-for-Linux/linux/tree/bc22545f38d74473cfef3e9fd65432733435b79f
>
> Cc: Adam Bratschi-Kaye <ark.email@xxxxxxxxx>
> Signed-off-by: Andreas Hindborg <a.hindborg@xxxxxxxxxxx>
>
> ---
>
> I have patches queued up for `rnull` that depend on this patch.
> ---
> rust/bindings/bindings_helper.h | 1 +
> rust/kernel/lib.rs | 2 +
> rust/kernel/module_param.rs | 339 ++++++++++++++++++++++++++++++++
> rust/macros/helpers.rs | 8 +
> rust/macros/module.rs | 230 ++++++++++++++++++++--
> samples/rust/rust_minimal.rs | 10 +
> 6 files changed, 571 insertions(+), 19 deletions(-)
> create mode 100644 rust/kernel/module_param.rs
>
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index ddb5644d4fd9..767169f46f10 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -25,3 +25,4 @@ const gfp_t RUST_CONST_HELPER_GFP_KERNEL = GFP_KERNEL;
> const gfp_t RUST_CONST_HELPER_GFP_KERNEL_ACCOUNT = GFP_KERNEL_ACCOUNT;
> const gfp_t RUST_CONST_HELPER_GFP_NOWAIT = GFP_NOWAIT;
> const gfp_t RUST_CONST_HELPER___GFP_ZERO = __GFP_ZERO;
> +const size_t RUST_CONST_HELPER_PAGE_SIZE = PAGE_SIZE;
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index fbd91a48ff8b..7cc2730e572f 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -33,6 +33,8 @@
> pub mod ioctl;
> #[cfg(CONFIG_KUNIT)]
> pub mod kunit;
> +#[doc(hidden)]
> +pub mod module_param;
> #[cfg(CONFIG_NET)]
> pub mod net;
> pub mod prelude;
> diff --git a/rust/kernel/module_param.rs b/rust/kernel/module_param.rs
> new file mode 100644
> index 000000000000..dc1108d20723
> --- /dev/null
> +++ b/rust/kernel/module_param.rs
> @@ -0,0 +1,339 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Types for module parameters.
> +//!
> +//! C header: [`include/linux/moduleparam.h`](../../../include/linux/moduleparam.h)
> +
> +use crate::{error::code::*, str::CStr};
> +
> +/// Types that can be used for module parameters.
> +///
> +/// Note that displaying the type in `sysfs` will fail if
> +/// [`core::str::from_utf8`] (as implemented through the [`core::fmt::Display`]
> +/// trait) writes more than [`PAGE_SIZE`] bytes (including an additional null
> +/// terminator).
> +///
> +/// [`PAGE_SIZE`]: `bindings::PAGE_SIZE`
> +pub trait ModuleParam: core::fmt::Display + core::marker::Sized {
> + /// The `ModuleParam` will be used by the kernel module through this type.
> + ///
> + /// This may differ from `Self` if, for example, `Self` needs to track
> + /// ownership without exposing it or allocate extra space for other possible
> + /// parameter values. This is required to support string parameters in the
> + /// future.
> + type Value: ?Sized;
> +
> + /// Whether the parameter is allowed to be set without an argument.
> + ///
> + /// Setting this to `true` allows the parameter to be passed without an
> + /// argument (e.g. just `module.param` instead of `module.param=foo`).
> + const NOARG_ALLOWED: bool;

I think, there is a better way of doing this. Instead of this bool, we
do the following:
1. have a `const DEFAULT: Option<Self>`
2. change the type of the argument of `try_from_param_arg` to
`&'static [u8]`

That way we don't have the weird behavior of `try_from_param_arg` that
for params that don't have a default value.

> +
> + /// Convert a parameter argument into the parameter value.
> + ///
> + /// `None` should be returned when parsing of the argument fails.
> + /// `arg == None` indicates that the parameter was passed without an
> + /// argument. If `NOARG_ALLOWED` is set to `false` then `arg` is guaranteed
> + /// to always be `Some(_)`.
> + ///
> + /// Parameters passed at boot time will be set before [`kmalloc`] is
> + /// available (even if the module is loaded at a later time). However, in
> + /// this case, the argument buffer will be valid for the entire lifetime of
> + /// the kernel. So implementations of this method which need to allocate
> + /// should first check that the allocator is available (with
> + /// [`crate::bindings::slab_is_available`]) and when it is not available
> + /// provide an alternative implementation which doesn't allocate. In cases
> + /// where the allocator is not available it is safe to save references to
> + /// `arg` in `Self`, but in other cases a copy should be made.
> + ///
> + /// [`kmalloc`]: ../../../include/linux/slab.h
> + fn try_from_param_arg(arg: Option<&'static [u8]>) -> Option<Self>;

Do we want this to return `Result` instead?
Or rather, why is this returning `Option`?

> +
> + /// Get the current value of the parameter for use in the kernel module.
> + ///
> + /// This function should not be used directly. Instead use the wrapper
> + /// `read` which will be generated by [`macros::module`].
> + fn value(&self) -> &Self::Value;
> +
> + /// Set the module parameter from a string.
> + ///
> + /// Used to set the parameter value at kernel initialization, when loading
> + /// the module or when set through `sysfs`.
> + ///
> + /// `param.arg` is a pointer to `*mut Self` as set up by the [`module!`]
> + /// macro.
> + ///
> + /// See `struct kernel_param_ops.set`.
> + ///
> + /// # Safety
> + ///
> + /// If `val` is non-null then it must point to a valid null-terminated
> + /// string. The `arg` field of `param` must be an instance of `Self`.
> + ///
> + /// # Invariants
> + ///
> + /// Currently, we only support read-only parameters that are not readable
> + /// from `sysfs`. Thus, this function is only called at kernel
> + /// initialization time, or at module load time, and we have exclusive
> + /// access to the parameter for the duration of the function.
> + ///
> + /// [`module!`]: macros::module
> + unsafe extern "C" fn set_param(
> + val: *const core::ffi::c_char,
> + param: *const crate::bindings::kernel_param,
> + ) -> core::ffi::c_int {
> + let arg = if val.is_null() {
> + None
> + } else {
> + // SAFETY: By function safety requirement, val is non-null and
> + // null-terminated. By C API contract, `val` is live and valid for
> + // reads for the duration of this function.
> + Some(unsafe { CStr::from_char_ptr(val).as_bytes() })
> + };
> + match Self::try_from_param_arg(arg) {
> + Some(new_value) => {
> + // SAFETY: `param` is guaranteed to be valid by C API contract
> + // and `arg` is guaranteed to point to an instance of `Self`.
> + let old_value = unsafe { (*param).__bindgen_anon_1.arg as *mut Self };
> + // SAFETY: `old_value` is valid for writes, as we have exclusive
> + // access. `old_value` is pointing to an initialized static, an
> + // so it is properly initialized.
> + let _ = unsafe { core::ptr::replace(old_value, new_value) };
> + 0
> + }
> + None => EINVAL.to_errno(),
> + }
> + }

Do implementers need to override this function? If not, then I would
move it out of the trait.

> +
> + /// Write a string representation of the current parameter value to `buf`.
> + ///
> + /// # Safety
> + ///
> + /// Must not be called.
> + ///
> + /// # Note
> + ///
> + /// This should not be called as we declare all parameters as read only.
> + unsafe extern "C" fn get_param(
> + _buf: *mut core::ffi::c_char,
> + _param: *const crate::bindings::kernel_param,
> + ) -> core::ffi::c_int {
> + 0
> + }

Ditto.

> +
> + /// Drop the parameter.
> + ///
> + /// Called when unloading a module.
> + ///
> + /// # Safety
> + ///
> + /// The `arg` field of `param` must be an initialized instance of `Self`.
> + unsafe extern "C" fn free(arg: *mut core::ffi::c_void) {
> + // SAFETY: By function safety requirement, `arg` is an initialized
> + // instance of `Self`. By C API contract, `arg` will not be used after
> + // this function returns.
> + unsafe { core::ptr::drop_in_place(arg as *mut Self) };
> + }

Ditto.

> +}

[...]

> diff --git a/rust/macros/helpers.rs b/rust/macros/helpers.rs
> index 563dcd2b7ace..dc0b47879a8c 100644
> --- a/rust/macros/helpers.rs
> +++ b/rust/macros/helpers.rs
> @@ -107,6 +107,14 @@ pub(crate) struct Generics {
> pub(crate) ty_generics: Vec<TokenTree>,
> }
>
> +pub(crate) fn get_string(it: &mut token_stream::IntoIter, expected_name: &str) -> String {

This name is rather weird, `get_field` makes more sense IMO.

> + assert_eq!(expect_ident(it), expected_name);
> + assert_eq!(expect_punct(it), ':');
> + let string = expect_string(it);
> + assert_eq!(expect_punct(it), ',');

Why do we require a trailing comma?

> + string
> +}
> +
> /// Parses the given `TokenStream` into `Generics` and the rest.
> ///
> /// The generics are not present in the rest, but a where clause might remain.
> diff --git a/rust/macros/module.rs b/rust/macros/module.rs
> index acd0393b5095..fc794855b99e 100644
> --- a/rust/macros/module.rs
> +++ b/rust/macros/module.rs
> @@ -1,7 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0
>
> use crate::helpers::*;
> -use proc_macro::{token_stream, Delimiter, Literal, TokenStream, TokenTree};
> +use proc_macro::{token_stream, Delimiter, Group, Literal, TokenStream, TokenTree};
> use std::fmt::Write;
>
> fn expect_string_array(it: &mut token_stream::IntoIter) -> Vec<String> {
> @@ -26,6 +26,7 @@ struct ModInfoBuilder<'a> {
> module: &'a str,
> counter: usize,
> buffer: String,
> + param_buffer: String,
> }
>
> impl<'a> ModInfoBuilder<'a> {
> @@ -34,10 +35,11 @@ fn new(module: &'a str) -> Self {
> module,
> counter: 0,
> buffer: String::new(),
> + param_buffer: String::new(),
> }
> }
>
> - fn emit_base(&mut self, field: &str, content: &str, builtin: bool) {
> + fn emit_base(&mut self, field: &str, content: &str, builtin: bool, param: bool) {
> let string = if builtin {
> // Built-in modules prefix their modinfo strings by `module.`.
> format!(
> @@ -51,8 +53,14 @@ fn emit_base(&mut self, field: &str, content: &str, builtin: bool) {
> format!("{field}={content}\0", field = field, content = content)
> };
>
> + let buffer = if param {
> + &mut self.param_buffer
> + } else {
> + &mut self.buffer
> + };
> +
> write!(
> - &mut self.buffer,
> + buffer,
> "
> {cfg}
> #[doc(hidden)]
> @@ -75,20 +83,190 @@ fn emit_base(&mut self, field: &str, content: &str, builtin: bool) {
> self.counter += 1;
> }
>
> - fn emit_only_builtin(&mut self, field: &str, content: &str) {
> - self.emit_base(field, content, true)
> + fn emit_only_builtin(&mut self, field: &str, content: &str, param: bool) {
> + self.emit_base(field, content, true, param)
> }
>
> - fn emit_only_loadable(&mut self, field: &str, content: &str) {
> - self.emit_base(field, content, false)
> + fn emit_only_loadable(&mut self, field: &str, content: &str, param: bool) {
> + self.emit_base(field, content, false, param)
> }
>
> fn emit(&mut self, field: &str, content: &str) {
> - self.emit_only_builtin(field, content);
> - self.emit_only_loadable(field, content);
> + self.emit_internal(field, content, false);
> + }
> +
> + fn emit_internal(&mut self, field: &str, content: &str, param: bool) {
> + self.emit_only_builtin(field, content, param);
> + self.emit_only_loadable(field, content, param);
> + }
> +
> + fn emit_param(&mut self, field: &str, param: &str, content: &str) {
> + let content = format!("{param}:{content}", param = param, content = content);
> + self.emit_internal(field, &content, true);
> + }
> +
> + fn emit_params(&mut self, info: &ModuleInfo) {

This parses the parameters, but shouldn't that happen in
`ModuleInfo::parse`? I agree that it should be a different function, but
I don't think that we should store the params as a `Group` and fail when
we emit the params.

> + if let Some(params) = &info.params {
> + assert_eq!(params.delimiter(), Delimiter::Brace);
> +
> + let mut it = params.stream().into_iter();
> +
> + loop {
> + let param_name = match it.next() {
> + Some(TokenTree::Ident(ident)) => ident.to_string(),
> + Some(_) => panic!("Expected Ident or end"),
> + None => break,
> + };
> +
> + assert_eq!(expect_punct(&mut it), ':');
> + let param_type = expect_ident(&mut it);
> + let group = expect_group(&mut it);
> + assert_eq!(expect_punct(&mut it), ',');

Why do we require a trailing comma?

> +
> + assert_eq!(group.delimiter(), Delimiter::Brace);

This check will should be immediately after expecting the group.

> +
> + let mut param_it = group.stream().into_iter();
> + let param_default = get_param_default(&mut param_it);

I don't understand why this is put into its own (5 LOC) function and
other parts of this function are not. If you can separate parsing from
emitting better, this will probably improve.

> + let param_description = get_string(&mut param_it, "description");
> + expect_end(&mut param_it);
> +
> + let (param_kernel_type, ops): (String, _) = (

Why do you need to specify the type like this?

> + param_type.to_string(),
> + param_ops_path(&param_type).to_string(),
> + );
> +
> + self.emit_param("parmtype", &param_name, &param_kernel_type);

Is the spelling intentional? "parmtype"?

> + self.emit_param("parm", &param_name, &param_description);
> + let param_type_internal = param_type.clone();
> +
> + let read_func = format!(
> + "
> + pub(crate) fn read(&self)
> + -> &<{param_type_internal}
> + as kernel::module_param::ModuleParam>::Value {{

Please add a `::` in front of `kernel::module_param::ModuleParam`. There
are more instances below.

> + // Note: when we enable r/w parameters, we need to lock here.
> +
> + // SAFETY: Parameters do not need to be locked because they are
> + // read only or sysfs is not enabled.
> + unsafe {{
> + <{param_type_internal} as kernel::module_param::ModuleParam>::value(
> + &__{name}_{param_name}_value
> + )
> + }}
> + }}
> + ",
> + name = info.name,
> + param_name = param_name,
> + param_type_internal = param_type_internal,
> + );
> +
> + let kparam = format!(
> + "
> + kernel::bindings::kernel_param__bindgen_ty_1 {{
> + // SAFETY: Access through the resulting pointer is
> + // serialized by C side and only happens before module
> + // `init` or after module `drop` is called.
> + arg: unsafe {{ &__{name}_{param_name}_value }}
> + as *const _ as *mut core::ffi::c_void,

Here you should use `addr_of[_mut]!` instead of taking a reference. Also
will this pointer be used to write to the static, in that case you need
`_mut!`.

> + }},
> + ",
> + name = info.name,
> + param_name = param_name,
> + );

What is the reason for putting `kparam` and `read_func` outside of the
`write!` below? I think it would be easier to read if they are inlined.

> + write!(
> + self.param_buffer,
> + "
> + static mut __{name}_{param_name}_value: {param_type_internal} = {param_default};
> +
> + pub(crate) struct __{name}_{param_name};
> +
> + impl __{name}_{param_name} {{ {read_func} }}
> +
> + pub(crate) const {param_name}: __{name}_{param_name} = __{name}_{param_name};

Why do we need a unit struct as a constant? I think it would make more
sense to have a unit struct/empty enum as the type and the `read`
function be without a receiver.

> +
> + // Note: the C macro that generates the static structs for the `__param` section
> + // asks for them to be `aligned(sizeof(void *))`. However, that was put in place
> + // in 2003 in commit 38d5b085d2a0 (\"[PATCH] Fix over-alignment problem on x86-64\")
> + // to undo GCC over-alignment of static structs of >32 bytes. It seems that is
> + // not the case anymore, so we simplify to a transparent representation here
> + // in the expectation that it is not needed anymore.
> + // TODO: Revisit this to confirm the above comment and remove it if it happened.

Should this TODO be fixed before this is merged? Or do you intend for it
to stay?
If this is indeed correct, should this also be changed in the C side (of
course a different patch)?

> + /// Newtype to make `bindings::kernel_param` `Sync`.
> + #[repr(transparent)]
> + struct __{name}_{param_name}_RacyKernelParam(kernel::bindings::kernel_param);
> +
> + // SAFETY: C kernel handles serializing access to this type. We
> + // never access from Rust module.
> + unsafe impl Sync for __{name}_{param_name}_RacyKernelParam {{
> + }}

Any reason to put the `}` on the next line?

> +
> + #[cfg(not(MODULE))]
> + const __{name}_{param_name}_name: *const core::ffi::c_char =
> + b\"{name}.{param_name}\\0\" as *const _ as *const core::ffi::c_char;
> +
> + #[cfg(MODULE)]
> + const __{name}_{param_name}_name: *const core::ffi::c_char =
> + b\"{param_name}\\0\" as *const _ as *const core::ffi::c_char;
> +
> + #[link_section = \"__param\"]
> + #[used]
> + static __{name}_{param_name}_struct: __{name}_{param_name}_RacyKernelParam =
> + __{name}_{param_name}_RacyKernelParam(kernel::bindings::kernel_param {{
> + name: __{name}_{param_name}_name,
> + // SAFETY: `__this_module` is constructed by the kernel at load time
> + // and will not be freed until the module is unloaded.
> + #[cfg(MODULE)]
> + mod_: unsafe {{ &kernel::bindings::__this_module as *const _ as *mut _ }},
> + #[cfg(not(MODULE))]
> + mod_: core::ptr::null_mut(),
> + // SAFETY: This static is actually constant as seen by
> + // module code. But we need a unique address for it, so it
> + // must be static.

This safety comment makes no sense, should it be a normal comment?

> + ops: unsafe {{ &{ops} }} as *const kernel::bindings::kernel_param_ops,

Why is this `unsafe` block needed, the `make_param_ops` macro declares a
non-mut static.

> + perm: 0, // Will not appear in sysfs
> + level: -1,

Why this value?

> + flags: 0,
> + __bindgen_anon_1: {kparam}
> + }});
> + ",
> + name = info.name,
> + param_type_internal = param_type_internal,
> + read_func = read_func,
> + param_default = param_default,
> + param_name = param_name,
> + ops = ops,
> + kparam = kparam,
> + )
> + .unwrap();
> + }
> + }
> }
> }
>
> +fn param_ops_path(param_type: &str) -> &'static str {
> + match param_type {
> + "i8" => "::kernel::module_param::PARAM_OPS_I8",
> + "u8" => "::kernel::module_param::PARAM_OPS_U8",
> + "i16" => "::kernel::module_param::PARAM_OPS_I16",
> + "u16" => "::kernel::module_param::PARAM_OPS_U16",
> + "i32" => "::kernel::module_param::PARAM_OPS_I32",
> + "u32" => "::kernel::module_param::PARAM_OPS_U32",
> + "i64" => "::kernel::module_param::PARAM_OPS_I64",
> + "u64" => "::kernel::module_param::PARAM_OPS_U64",
> + "isize" => "::kernel::module_param::PARAM_OPS_ISIZE",
> + "usize" => "::kernel::module_param::PARAM_OPS_USIZE",
> + t => panic!("Unrecognized type {}", t),

I would write "Unsupported parameter type `{}`!".

> + }
> +}
> +
> +fn get_param_default(param_it: &mut token_stream::IntoIter) -> String {
> + assert_eq!(expect_ident(param_it), "default");
> + assert_eq!(expect_punct(param_it), ':');
> + let default = try_literal(param_it).expect("Expected default param value");
> + assert_eq!(expect_punct(param_it), ',');
> + default
> +}
> +

Could you also add documentation in rust/macros/lib.rs on the `module!`
macro how parameters can be declared/used? I took a look and it seems
that there is out-of-date docs there also.

---
Cheers,
Benno