Re: [PATCH 00/11] pragma once: treewide conversion

From: Alexey Dobriyan
Date: Sun Feb 28 2021 - 14:35:49 EST


On Sun, Feb 28, 2021 at 09:46:17AM -0800, Linus Torvalds wrote:
> On Sun, Feb 28, 2021 at 8:57 AM Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:
> >
> > This is bulk deletion of preprocessor include guards and conversion
> > to #pragma once directive.
>
> So as mentioned earlier, I'm not 100% convinced about the advantage of
> #pragma once.
>
> But I decided to actually test it, and it turns out that it causes
> problems for at least sparse.

Oh no.

> Sparse *does* support pragma once, but it does it purely based on
> pathname equality.

Doing what gcc or clang does seems like a smart thing to do.

> So a simple test-program like this:
>
> File 'pragma.h':
>
> #pragma once
> #include "header.h"
>
> works fine. But this doesn't work at all:
>
> #pragma once
> #include "./header.h"
>
> because it causes the filename to be different every time, and you
> eventually end up with trying to open "././....../pragma.h" and it
> causes ENAMETOOLONG.
>
> So at least sparse isn't ready for this.
>
> I guess sparse could always simplify the name, but that's non-trivial.
>
> And honestly, using st_dev/st_ino is problematic too, since
>
> (a) they can easily be re-used for generated files
>
> (b) you'd have to actually open/fstat the filename to use it, which
> obviates one of the optimizations

fstat is more or less necessary anyway to allocate just enough memory
for 1 read. fstat is not a problem, read is (and subsequent parsing).

> Trying the same on gcc, you don't get that endless "add "./" behavior"
> that sparse did, but a quick test shows that it actually opens the
> file and reads it three times: once for "pramga.h", once for
> "./pragma.h" and a third time for "pragma.h". It only seems to
> _expand_ it once, though.
>
> I have no idea what gcc does. Maybe it does some "different name, so
> let's open and read it, and then does st_dev/st_ino again". But if so,
> why the _third_ time? Is it some guard against "st_ino might have been
> re-used, so I'll open the original name and re-verify"?
>
> End result: #pragma is fundamentally less reliable than the
> traditional #ifdef guard. The #ifdef guard works fine even if you
> re-read the file for whatever reason, while #pragma relies on some
> kind of magical behavior.
>
> I'm adding Luc in case he has any ideas of what the magical behavior might be.

gcc does

open "/" + "whatever between quotes"
fstat

so that "1.h" and "./1.h" differ

https://github.com/gcc-mirror/gcc/blob/master/libcpp/files.c#L377

clang does better:

"./" + "whatever between quotes"
open
fstat
normalise pathname via readlink /proc/*/fd

I think it is quite hard to break something with double inclusion
without trying to actually break stuff. Macros has to be token
for token identical or compiler warn. Types definition too.
Function prototypes and so on.

This is how I found half of the exception list.

The "no leading ./ in includes is trivially enforced with checkpatch.pl
or even grep! And it will optimise the build now that gcc behaviour has
been uncovered.

Include guards aren't without problems.

We have at least 1 identical include guard in the tree for two
completely unrelated headers (allmodconfig of some fringe arch found it)
Nobody complains because only defconfigs are run apparently.

Developer can typo it disabling double inclusion.

#ifndef FOO_H
#define FOO_h
#endif

I've seen a reference to a static checker checking for such stuff
so this problem exists.

Invisibly broken inlcude guards (see qla2xxx patch in the series).