Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc

From: Mauro Carvalho Chehab

Date: Tue Mar 03 2026 - 11:15:50 EST


On Tue, 3 Mar 2026 15:12:30 +0000
"Loktionov, Aleksandr" <aleksandr.loktionov@xxxxxxxxx> wrote:

> > -----Original Message-----
> > From: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>
> > Sent: Tuesday, March 3, 2026 3:53 PM
> > To: Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx>
> > Cc: Lobakin, Aleksander <aleksander.lobakin@xxxxxxxxx>; Jonathan
> > Corbet <corbet@xxxxxxx>; Kees Cook <kees@xxxxxxxxxx>; Mauro Carvalho
> > Chehab <mchehab@xxxxxxxxxx>; intel-wired-lan@xxxxxxxxxxxxxxxx; linux-
> > doc@xxxxxxxxxxxxxxx; linux-hardening@xxxxxxxxxxxxxxx; linux-
> > kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; Gustavo A. R. Silva
> > <gustavoars@xxxxxxxxxx>; Loktionov, Aleksandr
> > <aleksandr.loktionov@xxxxxxxxx>; Randy Dunlap <rdunlap@xxxxxxxxxxxxx>;
> > Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx>
> > Subject: Re: [PATCH 00/38] docs: several improvements to kernel-doc
> >
> > On Mon, 23 Feb 2026 15:47:00 +0200
> > Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> wrote:
> >
> > > There's always the question, if you're putting a lot of effort into
> > > making kernel-doc closer to an actual C parser, why not put all that
> > > effort into using and adapting to, you know, an actual C parser?
> >
> > Playing with this idea, it is not that hard to write an actual C
> > parser - or at least a tokenizer. There is already an example of it
> > at:
> >
> > https://docs.python.org/3/library/re.html
> >
> > I did a quick implementation, and it seems to be able to do its job:

...

>
> As hobby C compiler writer, I must say that you need to implement C preprocessor first, because C preprocessor influences/changes the syntax.
> In your tokenizer I see right away that any line which begins from '#' must be just as C preprocessor command without further tokenizing.

Yeah, we may need to implement a C preprocessor parser in the future,
but this will require handling #include, which could be somewhat
complex. It is also tricky to handle conditional preprocessor macros,
as kernel-doc would either require a file with at least some defines
or would have to guess how to evaluate them to produce the right
documentation, as ifdefs interfere with C macros.
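
Short of a full preprocessor, the point about '#' lines can be covered
in a regex tokenizer of the kind shown in the Python re documentation
by matching a whole directive as one token. The sketch below is only an
illustration: the token names, TOKEN_SPEC and tokenize() are made up
here, and this is not the implementation elided above.

```python
import re

# Order matters: CPP must be tried before PUNCT would match a lone '#'.
TOKEN_SPEC = [
    # A directive starts at the beginning of a line and runs to the end
    # of the line, following backslash-newline continuations.
    ('CPP',     r'^[ \t]*#(?:\\\n|[^\n])*'),
    ('COMMENT', r'/\*.*?\*/|//[^\n]*'),
    ('STRING',  r'"(?:\\.|[^"\\])*"'),
    ('NUMBER',  r'\d[\w.]*'),
    ('NAME',    r'[A-Za-z_]\w*'),
    # A newline is its own token, so the next match starts at a line start
    # and the '^' in CPP gets a chance to match.
    ('SPACE',   r'[ \t\r\f\v]+|\n'),
    ('PUNCT',   r'[^\w\s]'),
]
TOKEN_RE = re.compile('|'.join(f'(?P<{n}>{p})' for n, p in TOKEN_SPEC),
                      re.M | re.S)


def tokenize(source):
    """Yield (kind, text) pairs, skipping whitespace."""
    for m in TOKEN_RE.finditer(source):
        if m.lastgroup != 'SPACE':
            yield m.lastgroup, m.group()


src = "#define FOO(x) \\\n    ((x) + 1)\nint a;\n"
for kind, text in tokenize(src):
    print(kind, repr(text))
```

The directive is kept as opaque text: nothing here evaluates #if or
follows #include, which is the hard part described above.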

For now, I want to solve some specific problems:

- fix the trim_private_members() function, which is meant to handle
/* private: */ and /* public: */ comments, as it currently has
bugs when used on nested structs/unions, related to where the
"private" scope finishes;

- properly parse nested struct/union and properly pick nested
identifiers;

- detect and replace function arguments when macros with multiple
arguments are used in the same prototype.
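
For the first item, the question of where a "private" scope finishes
can be answered by tracking brace depth. The sketch below is not the
kernel-doc code; trim_private() and TOKEN_RE are hypothetical names, and
it assumes that a private scope ends either at a /* public: */ marker at
the same nesting level or at the '}' closing the struct/union in which
/* private: */ appeared.

```python
import re

TOKEN_RE = re.compile(r'''
    (?P<private>/\*\s*private:.*?\*/)   # private marker comment
  | (?P<public>/\*\s*public:.*?\*/)     # public marker comment
  | (?P<comment>/\*.*?\*/)              # any other comment
  | (?P<open>\{)
  | (?P<close>\})
  | (?P<other>[^{}/]+|/)                # everything else
''', re.S | re.X)


def trim_private(members):
    """Drop the text that falls inside a private scope."""
    out = []
    depth = 0
    private_depth = None    # brace depth at which private: was seen
    for m in TOKEN_RE.finditer(members):
        kind, text = m.lastgroup, m.group()
        if kind == 'open':
            depth += 1
        elif kind == 'close':
            depth -= 1
            # Assumption: closing the enclosing block ends the scope.
            if private_depth is not None and depth < private_depth:
                private_depth = None
        elif kind == 'private':
            if private_depth is None:
                private_depth = depth
            continue
        elif kind == 'public':
            # Only a marker at the same level reopens the scope.
            if private_depth is not None and depth == private_depth:
                private_depth = None
            continue
        if private_depth is None:
            out.append(text)
    return ''.join(out)


src = "int a; struct { int b; /* private: */ int c; } inner; int d;"
print(trim_private(src))
```

With this rule, "int c" is dropped while "inner" and "int d" survive,
since the private scope does not leak out of the nested struct.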

Plus, kernel-doc already has a table of transforms to "convert"
the C preprocessor macros that affect documentation into something
that will work.

So, I'm considering starting simple: ignoring cpp for now and addressing
the existing issues.

> But the real pain make C preprocessor substitutions IMHO

Agreed. For now, we're using a transforms list inside kernel-doc for
that purpose. So, those macros are manually "evaluated" there, like:

(KernRe(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + struct_args_pattern + r'\)', re.S), r'dma_addr_t \1'),
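
To make the effect of such an entry concrete, here is a minimal sketch
using plain re instead of kernel-doc's KernRe wrapper. The value given
to struct_args_pattern is an assumption (one capture group matching a
single argument), and apply_transforms() is a hypothetical helper.

```python
import re

# Assumption: a single capture group matching one macro argument.
struct_args_pattern = r'([^,)]+)'

# (compiled pattern, replacement) pairs, in the style of the entry above.
transforms = [
    (re.compile(r'DEFINE_DMA_UNMAP_ADDR\s*\(' + struct_args_pattern + r'\)',
                re.S),
     r'dma_addr_t \1'),
]


def apply_transforms(text):
    """Apply each (pattern, replacement) pair in order."""
    for pattern, repl in transforms:
        text = pattern.sub(repl, text)
    return text


print(apply_transforms("struct foo { DEFINE_DMA_UNMAP_ADDR(mapping); int len; };"))
```

Because the capture group stops at the first ',' or ')', an argument
that itself contains parentheses or commas is not matched correctly.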

This works fine in trivial cases, where the argument is just an ID,
but there are cases where we use macros like this:

struct page_pool_params {
    struct_group_tagged(page_pool_params_fast, fast,
        unsigned int order;
        unsigned int pool_size;
        int nid;
        struct device *dev;
        struct napi_struct *napi;
        enum dma_data_direction dma_dir;
        unsigned int max_len;
        unsigned int offset;
    );
    struct_group_tagged(page_pool_params_slow, slow,
        struct net_device *netdev;
        unsigned int queue_idx;
        unsigned int flags;
        /* private: used by test code only */
        void (*init_callback)(netmem_ref netmem, void *arg);
        void *init_arg;
    );
};

To handle it, I'm thinking of using something like this(*):

(CFunction('struct_group_tagged'), r'struct \1 { \3 } \2;'),

E.g. teaching kernel-doc that, when:

struct_group_tagged(a, b, c)

is used, it should convert it into:

struct a { c } b;

which is basically what this macro does. In other words, hardcoding
kernel-doc with some rules to handle the cases where CPP macros
need to be evaluated. As there aren't many cases where such macros affect
documentation (in many cases, just dropping the macros is enough), such
an approach kinda works.
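
The struct_group_tagged(a, b, c) rewrite needs argument splitting that
respects nesting, since the third argument contains whole member
declarations. The sketch below is not the CFunction patch mentioned
here; split_args() and expand_macro() are hypothetical helpers showing
one way to do it with a balanced-delimiter scan.

```python
import re


def split_args(text, start):
    """Split the arguments of a call whose '(' is at index `start`.

    Returns (args, end), with `end` just past the matching ')'.
    Commas nested inside (), [] or {} do not split arguments.
    """
    args, cur, depth = [], [], 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch in '([{':
            depth += 1
            if depth == 1:
                continue            # skip the call's own '('
        elif ch in ')]}':
            depth -= 1
            if depth == 0:
                args.append(''.join(cur).strip())
                return args, i + 1
        elif ch == ',' and depth == 1:
            args.append(''.join(cur).strip())
            cur = []
            continue
        cur.append(ch)
    raise ValueError("unbalanced delimiters")


def expand_macro(text, name, template):
    r"""Replace each name(...) call, filling \1, \2, ... from its args."""
    pattern = re.compile(r'\b' + re.escape(name) + r'\s*\(')
    out, pos = [], 0
    while True:
        m = pattern.search(text, pos)
        if not m:
            break
        args, end = split_args(text, m.end() - 1)
        out.append(text[pos:m.start()])
        out.append(re.sub(r'\\(\d+)',
                          lambda g: args[int(g.group(1)) - 1], template))
        pos = end
    out.append(text[pos:])
    return ''.join(out)


src = "struct_group_tagged(tag, name, int x; void (*cb)(int a, int b);)"
print(expand_macro(src, 'struct_group_tagged', r'struct \1 { \3 } \2;'))
```

This scan works on raw characters, so a comma inside a comment or a
string at the top level of the call would still split an argument; a
tokenizer-based version would not have that problem.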

(*) I already wrote a patch for it, but as Jani pointed out, perhaps
using a tokenizer will make the logic simpler and easier to
understand/maintain.

--
Thanks,
Mauro