Re: [Intel-wired-lan] [PATCH 00/38] docs: several improvements to kernel-doc

From: Mauro Carvalho Chehab

Date: Wed Mar 04 2026 - 07:20:27 EST

On Wed, 04 Mar 2026 12:07:45 +0200
Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> wrote:

> On Mon, 23 Feb 2026, Jonathan Corbet <corbet@xxxxxxx> wrote:
> > Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> writes:
> >
> >> There's always the question, if you're putting a lot of effort into
> >> making kernel-doc closer to an actual C parser, why not put all that
> >> effort into using and adapting to, you know, an actual C parser?
> >
> > Not speaking to the current effort but ... in the past, when I have
> > contemplated this (using, say, tree-sitter), the real problem is that
> > those parsers simply strip out the comments. Kerneldoc without comments
> > ... doesn't work very well. If there were a parser without those
> > problems, and which could be made to do the right thing with all of our
> > weird macro usage, it would certainly be worth considering.
>
> I think e.g. libclang and its Python bindings can be made to work. The
> main problems with that are passing proper compiler options (because
> it'll need to include stuff to know about types etc. because it is a
> proper parser), preprocessing everything is going to take time, you need
> to invest a bunch into it to know how slow exactly compared to the
> current thing and whether it's prohibitive, and it introduces an extra
> dependency.

It is not just that. Assume we're parsing something like this:

static __always_inline int _raw_read_trylock(rwlock_t *lock)
__cond_acquires_shared(true, lock);

using cpp (or libclang). We would need to decide whether to define or
undefine three symbols:

#if defined(WARN_CONTEXT_ANALYSIS) && !defined(__CHECKER__) && !defined(__GENKSYMS__)

(In this particular case the default is OK, but in other cases it may
not be.)
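
To make the size of that decision concrete, here is a minimal Python sketch
that enumerates every define/undefine combination of the three symbols and
evaluates the guard for each. The function names are hypothetical and this
is not part of kernel-doc; it only mirrors the #if expression quoted above.

```python
from itertools import product

# The three symbols tested by the #if guard quoted above.
SYMBOLS = ("WARN_CONTEXT_ANALYSIS", "__CHECKER__", "__GENKSYMS__")


def guard_is_true(defined):
    """Evaluate the quoted #if guard for a given set of defined symbols."""
    return ("WARN_CONTEXT_ANALYSIS" in defined
            and "__CHECKER__" not in defined
            and "__GENKSYMS__" not in defined)


def all_configurations():
    """Yield (defined_symbols, guard_result) for every combination."""
    for flags in product((False, True), repeat=len(SYMBOLS)):
        defined = frozenset(s for s, on in zip(SYMBOLS, flags) if on)
        yield defined, guard_is_true(defined)


if __name__ == "__main__":
    for defined, result in all_configurations():
        print(sorted(defined), "->", result)
```

A preprocessor-based tool has to pick one of these combinations for every
such guard it meets, and the right pick may differ per header.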

This is far more complex than just writing logic that converts the
above into:

static int _raw_read_trylock(rwlock_t *lock);

which is the current kernel-doc approach.
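
As a rough illustration of that kind of conversion, here is a minimal
Python sketch. It is not the actual kernel-doc code: the annotation lists
and the function name are assumptions made up for this example, and it only
handles annotations whose argument list has no nested parentheses.

```python
import re

# Hypothetical annotation lists for illustration; not kernel-doc's own tables.
CALL_ANNOTATIONS = ("__cond_acquires_shared",)
PLAIN_ANNOTATIONS = ("__always_inline",)


def simplify_prototype(proto):
    """Reduce an annotated C prototype to a plain declaration."""
    # Drop function-like annotations together with their argument list.
    for name in CALL_ANNOTATIONS:
        proto = re.sub(r"\b" + re.escape(name) + r"\s*\([^()]*\)", " ", proto)
    # Drop bare annotation keywords.
    for name in PLAIN_ANNOTATIONS:
        proto = re.sub(r"\b" + re.escape(name) + r"\b", " ", proto)
    # Collapse whitespace (including newlines) and tidy the final semicolon.
    proto = " ".join(proto.split())
    return re.sub(r"\s*;", ";", proto)


if __name__ == "__main__":
    annotated = ("static __always_inline int _raw_read_trylock(rwlock_t *lock)\n"
                 "__cond_acquires_shared(true, lock);")
    print(simplify_prototype(annotated))
```

No compiler options, include paths or defines are needed for this step,
which is the point of the comparison.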

-

Using a C preprocessor, we might end up with a very large prototype, and
arch-specific defines could even affect it, since some includes may live
under arch/*/include.

So, we would need a kernel-doc ".config" file with a set of defines
that can be hard to maintain.

> So yeah, there are definitely tradeoffs there. But it's not like this
> constant patching of kernel-doc is exactly burden free either. I don't
> know, is it just me, but I'd like to think as a profession we'd be past
> writing ad hoc C parsers by now.

I'd say that the binding logic and the ".config" kernel-doc defines will
be complex to maintain. Maybe more complex than kernel-doc patching and
a simple C parser, like the one in my test.

> > On Mon, 23 Feb 2026 15:47:00 +0200
> > Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> wrote:
> >> There's always the question, if you're putting a lot of effort into
> >> making kernel-doc closer to an actual C parser, why not put all that
> >> effort into using and adapting to, you know, an actual C parser?
> >
> > Playing with this idea, it is not that hard to write an actual C
> > parser - or at least a tokenizer.
>
> Just for the record, I suggested using an existing parser, not going all
> NIH and writing your own.

I know. Still, I suspect that a simple tokenizer similar to my example
might do the job without any major impact. But yes, tests are needed.
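
For readers without the earlier mail at hand, here is a minimal sketch of
what a comment-preserving C tokenizer can look like in Python. It is an
illustration only, not the tokenizer referred to above: the token classes
and names are assumptions, and it ignores preprocessor directives, line
continuations and many multi-character operators.

```python
import re

# Comments come first so that "/*" and "//" are not split into punctuation.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT> /\*.*?\*/ | //[^\n]* )
  | (?P<STRING>  "(?:\\.|[^"\\])*" )
  | (?P<CHAR>    '(?:\\.|[^'\\])*' )
  | (?P<NUMBER>  \d[\w.]* )
  | (?P<NAME>    [A-Za-z_]\w* )
  | (?P<PUNCT>   ->|\.\.\.|[{}()\[\];,*=&<>!+\-/%|^~?:.\#] )
  | (?P<SPACE>   \s+ )
""", re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Yield (kind, text) pairs, keeping comments and dropping whitespace."""
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "SPACE":
            yield match.lastgroup, match.group()


if __name__ == "__main__":
    sample = "/** doc */\nstatic int f(rwlock_t *lock);"
    for kind, text in tokenize(sample):
        print(kind, repr(text))
```

Unlike a parser that discards comments, the COMMENT tokens stay in the
stream, so the kernel-doc block remains next to the declaration it
documents.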

--
Thanks,
Mauro