[PATCH 05/17] genksyms: fix last 3 shift/reduce conflicts

From: Masahiro Yamada
Date: Mon Jan 13 2025 - 10:10:45 EST


The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.

Building genksyms with W=1 generates the following warnings:

YACC scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples

The ambiguity arises when decl_specifier_seq is followed by '(' because
the following two interpretations are possible:

- decl_specifier_seq direct_abstract_declarator '(' parameter_declaration_clause ')'
- decl_specifier_seq '(' abstract_declarator ')'

This issue occurs because the current parser allows an empty string to
be reduced to direct_abstract_declarator, which is incorrect.

K&R [1] explains the correct grammar:

<parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
| {<declaration-specifier>}+ <abstract-declarator>
| {<declaration-specifier>}+

<abstract-declarator> ::= <pointer>
| <pointer> <direct-abstract-declarator>
| <direct-abstract-declarator>

<direct-abstract-declarator> ::= ( <abstract-declarator> )
| {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
| {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )

We need to consider the difference between the following two examples:

[Example 1] ( <abstract-declarator> ) can become <direct-abstract-declarator>

void my_func(int (foo));

... is equivalent to:

void my_func(int foo);

[Example 2] ( <parameter-type-list> ) can become <direct-abstract-declarator>

typedef int foo;
void my_func(int (foo));

... is equivalent to:

void my_func(int (*callback)(int));

Please note that the function declaration is identical in both examples,
but the preceding typedef creates the distinction. I introduced a new
term, open_paren, to enable the type lookup immediately after the '('
token. Without this, we cannot distinguish between [Example 1] and
[Example 2].

With this commit, all conflicts are resolved.

[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm

Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
---

scripts/genksyms/parse.y | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/scripts/genksyms/parse.y b/scripts/genksyms/parse.y
index dc575d467bbf..fafce939c32f 100644
--- a/scripts/genksyms/parse.y
+++ b/scripts/genksyms/parse.y
@@ -363,35 +363,47 @@ parameter_declaration_list:
;

parameter_declaration:
- decl_specifier_seq abstract_declarator
+ decl_specifier_seq abstract_declarator_opt
{ $$ = $2 ? $2 : $1; }
;

+abstract_declarator_opt:
+ /* empty */ { $$ = NULL; }
+ | abstract_declarator
+ ;
+
abstract_declarator:
- ptr_operator abstract_declarator
+ ptr_operator
+ | ptr_operator abstract_declarator
{ $$ = $2 ? $2 : $1; }
| direct_abstract_declarator
{ $$ = $1; dont_want_type_specifier = false; }
;

direct_abstract_declarator:
- /* empty */ { $$ = NULL; }
- | IDENT
+ IDENT
{ /* For version 2 checksums, we don't want to remember
private parameter names. */
remove_node($1);
$$ = $1;
}
- | direct_abstract_declarator '(' parameter_declaration_clause ')'
+ | direct_abstract_declarator open_paren parameter_declaration_clause ')'
{ $$ = $4; }
- | direct_abstract_declarator '(' error ')'
+ | direct_abstract_declarator open_paren error ')'
{ $$ = $4; }
| direct_abstract_declarator BRACKET_PHRASE
{ $$ = $2; }
- | '(' abstract_declarator ')'
+ | open_paren parameter_declaration_clause ')'
{ $$ = $3; }
- | '(' error ')'
+ | open_paren abstract_declarator ')'
{ $$ = $3; }
+ | open_paren error ')'
+ { $$ = $3; }
+ | BRACKET_PHRASE
+ ;
+
+open_paren:
+ '(' { $$ = $1; dont_want_type_specifier = false; }
;

function_definition:
--
2.43.0