Re: [RFC] scripts: kernel-doc: improve parsing for kernel-doc comments syntax
From: Jonathan Corbet
Date: Thu Apr 15 2021 - 17:29:52 EST
Aditya Srivastava <yashsri421@xxxxxxxxx> writes:
> Currently kernel-doc does not identify some cases of probable kernel-doc
> comments, e.g. a pointer used as the declaration type for an identifier,
> space-separated identifiers, etc.
>
> Some examples of these cases in files are:
> i) " * journal_t * jbd2_journal_init_dev() - creates and initialises a journal structure"
> in fs/jbd2/journal.c
>
> ii) "* dget, dget_dlock - get a reference to a dentry" in
> include/linux/dcache.h
>
> iii) " * DEFINE_SEQLOCK(sl) - Define a statically allocated seqlock_t"
> in include/linux/seqlock.h
>
> Also improve identification of non-kernel-doc comments. For example,
>
> i) " * The following functions allow us to read data using a swap map"
> in kernel/power/swap.c does follow the kernel-doc-like syntax, but the
> content inside does not adhere to the expected format.
>
> Improve parsing by adding support for these probable attempts to write
> kernel-doc comments.
>
> Suggested-by: Jonathan Corbet <corbet@xxxxxxx>
> Link: https://lore.kernel.org/lkml/87mtujktl2.fsf@xxxxxxxxxxxx
> Signed-off-by: Aditya Srivastava <yashsri421@xxxxxxxxx>
> ---
> scripts/kernel-doc | 16 ++++++++++++----
> 1 file changed, 12 insertions(+), 4 deletions(-)
OK, I've applied this, but I have a couple of comments...
> diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> index 888913528185..37665aa41e6b 100755
> --- a/scripts/kernel-doc
> +++ b/scripts/kernel-doc
> @@ -2110,17 +2110,25 @@ sub process_name($$) {
> } elsif (/$doc_decl/o) {
> $identifier = $1;
> my $is_kernel_comment = 0;
> - if (/^\s*\*\s*([\w\s]+?)(\(\))?\s*([-:].*)?$/) {
> + my $decl_start = qr{\s*\*};
I appreciate the attempt to make the regexes a bit more comprehensible,
but we can do better yet, methinks. This $decl_start is very much like
$doc_com defined globally.
It would really help a lot if we could at least take the incredible mass
of regexes in this program and boil them down to a smaller, unique set
that is used throughout. kernel-doc might still make brains explode,
but perhaps the blast radius would be a bit smaller.
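Something along these lines is roughly what I have in mind. It is an
untested sketch, not a patch: $doc_com is assumed to match the global
definition in the script, and decl_identifier() is a made-up helper
that exists only to show the shared pieces being reused.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Assumed to mirror the global definition in scripts/kernel-doc.
my $doc_com = '\s*\*\s*';

# Pointer return type, e.g. "journal_t * "; defined once, next to
# $doc_com, rather than alongside a near-duplicate $decl_start.
my $fn_type = qr{\w+\s*\*\s*};

# Hypothetical helper: return the identifier when the line looks like
# "* [type *] name[(args)] - description", nothing otherwise.
sub decl_identifier {
    my ($line) = @_;
    return $1
        if $line =~ /^$doc_com(?:$fn_type)?(\w+)\s*(?:\(\w*\))?\s*[-:]/;
    return;
}

# prints jbd2_journal_init_dev
print decl_identifier(" * journal_t * jbd2_journal_init_dev() - creates"), "\n";
# prints DEFINE_SEQLOCK
print decl_identifier(" * DEFINE_SEQLOCK(sl) - Define a seqlock_t"), "\n";
```

The point is only that the comment prefix lives in one place; the rest
of the pattern here is simplified and does not cover every case the
patch handles.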
> + my $fn_type = qr{\w+\s*\*\s*}; # i.e. pointer declaration type, foo * bar() - desc
Some of the lines in this change go waaaaay beyond the 80-character
limit; please try not to do that. I fixed up the offending comments
this time around.
Thanks,
jon