Re: [PATCH v3 01/56] scripts: kernel-doc: fix typedef parsing

From: Mauro Carvalho Chehab
Date: Mon Oct 26 2020 - 01:55:51 EST


On Fri, 23 Oct 2020 11:22:26 -0600
Jonathan Corbet <corbet@xxxxxxx> wrote:

> On Fri, 23 Oct 2020 18:32:48 +0200
> Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> wrote:
>
> > The include/linux/genalloc.h file defined this typedef:
> >
> > typedef unsigned long (*genpool_algo_t)(unsigned long *map,unsigned long size,unsigned long start,unsigned int nr,void *data, struct gen_pool *pool, unsigned long start_addr);
> >
> > Because it has a type composite of two words (unsigned long),
> > the parser gets the typedef name wrong:
> >
> > .. c:macro:: long
> >
> > **Typedef**: Allocation callback function type definition
> >
> > Fix the regex in order to accept composite types when
> > defining a typedef for a function pointer.
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>
> > ---
> > scripts/kernel-doc | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/scripts/kernel-doc b/scripts/kernel-doc
> > index 99cd8418ff8a..311d213ee74d 100755
> > --- a/scripts/kernel-doc
> > +++ b/scripts/kernel-doc
> > @@ -1438,7 +1438,7 @@ sub dump_typedef($$) {
> > $x =~ s@/\*.*?\*/@@gos; # strip comments.
> >
> > # Parse function prototypes
> > - if ($x =~ /typedef\s+(\w+)\s*\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
> > + if ($x =~ /typedef\s+(\w+\s*){1,}\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
>
> I sure wish we could find a way to make all these regexes more
> understandable and maintainable. Reviewing a change like this is ... fun.
>
> Anyway, it seems to work, but it does now include trailing whitespace in
> the type portion. So, for example, from include/linux/xarray.h:
>
> typedef void (*xa_update_node_t)(struct xa_node *node);
>
> The type is parsed as "void " where it was "void" before. The only ill
> effect I can see is that some non-breaking spaces get inserted into the
> HTML output, but perhaps it's worth stripping off that trailing space
> anyway?

Yeah, this is one of the issues. There's another one, though. While
the above regex recognizes the typedef identifier, it captures only
the last word of the "unsigned long" return type in a case like:

typedef unsigned long (*genpool_algo_t)(unsigned long *map);
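
As a rough illustration of that behaviour, here is a small Python sketch (not the kernel-doc Perl code; the variable names are made up for the example, and Python's re module is used only because it handles these constructs the same way for this input). A capturing group under a repetition keeps only its last iteration:

```python
import re

# Regex from the v3 patch under discussion: the type group itself is
# repeated, so group 1 holds only the last repetition.
V3_REGEX = re.compile(r"typedef\s+(\w+\s*){1,}\(\*\s*(\w\S+)\s*\)\s*\((.*)\);")

decl = "typedef unsigned long (*genpool_algo_t)(unsigned long *map);"
m = V3_REGEX.search(decl)

return_type = m.group(1)   # last iteration of the repeated group
name = m.group(2)          # typedef identifier
args = m.group(3)          # raw argument list

print(repr(return_type), repr(name), repr(args))
```

The identifier and arguments come out right, but the return type loses "unsigned" and keeps the trailing space.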

Here, we have no option but to use a non-capturing group, e.g. with
this regex:

typedef\s+((?:\w+\s*){1,})\(\*\s*(\w\S+)\s*\)\s*\((.*)\);

I'm enclosing a second version with the above.

Yeah, reviewing it is even more fun, but regex101 can be used to
double-check what the regex is doing:

https://regex101.com/r/bPTm18/2
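
The same check can also be sketched locally. The snippet below is a Python approximation, not the actual kernel-doc code: parse_function_typedef is a hypothetical helper, and rstrip() stands in for the Perl s/\s+$// used to drop the trailing whitespace. It wraps the repetition in a non-capturing group so that group 1 covers the whole return type:

```python
import re

# Same pattern as above, with the repetition moved into a non-capturing
# group (?:...) and a single outer capture around all the type words.
FIXED_REGEX = re.compile(
    r"typedef\s+((?:\w+\s*){1,})\(\*\s*(\w\S+)\s*\)\s*\((.*)\);"
)

def parse_function_typedef(decl):
    """Return (return_type, name, args) or None if decl does not match."""
    m = FIXED_REGEX.search(decl)
    if m is None:
        return None
    # Strip the trailing whitespace picked up by \s* inside the group.
    return_type = m.group(1).rstrip()
    return return_type, m.group(2), m.group(3)

for decl in (
    "typedef unsigned long (*genpool_algo_t)(unsigned long *map);",
    "typedef void (*xa_update_node_t)(struct xa_node *node);",
):
    print(parse_function_typedef(decl))
```

Both the two-word case from genalloc.h and the single-word case from xarray.h should then yield a return type with no trailing space.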

Thanks,
Mauro

[PATCH] scripts: kernel-doc: fix typedef parsing

The include/linux/genalloc.h file defined this typedef:

typedef unsigned long (*genpool_algo_t)(unsigned long *map,unsigned long size,unsigned long start,unsigned int nr,void *data, struct gen_pool *pool, unsigned long start_addr);

Because its return type is composed of two words (unsigned long),
the parser gets the typedef name wrong:

.. c:macro:: long

**Typedef**: Allocation callback function type definition

Fix the regex in order to accept composite types when
defining a typedef for a function pointer.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>

diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 99cd8418ff8a..b37f3cf8a331 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -1438,13 +1438,14 @@ sub dump_typedef($$) {
$x =~ s@/\*.*?\*/@@gos; # strip comments.

# Parse function prototypes
- if ($x =~ /typedef\s+(\w+)\s*\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
+ if ($x =~ /typedef\s+((?:\w+\s*){1,})\(\*\s*(\w\S+)\s*\)\s*\((.*)\);/ ||
$x =~ /typedef\s+(\w+)\s*(\w\S+)\s*\s*\((.*)\);/) {

# Function typedefs
$return_type = $1;
$declaration_name = $2;
my $args = $3;
+ $return_type =~ s/\s+$//;

create_parameterlist($args, ',', $file, $declaration_name);