Re: [PATCH v2] kheaders: prevent `find` from seeing perl temp files

From: Masahiro Yamada
Date: Wed Dec 18 2024 - 03:36:17 EST


On Fri, Dec 6, 2024 at 9:00 AM HONG Yifan <elsk@xxxxxxxxxx> wrote:
>
> Symptom:
>
> The command
>
> find ... | xargs ... perl -i
>
> occasionally triggers error messages like the following, with the build
> still succeeding:
>
> Can't open <redacted>/kernel/.tmp_cpio_dir/include/dt-bindings/clock/XXNX4nW9: No such file or directory.
>
> Analysis:
>
> With strace, the root cause has been identified to be `perl -i` creating
> temporary files inside $cpio_dir, which causes `find` to see the
> temporary files and emit the names. `find` is likely implemented with
> readdir. POSIX `readdir` says:
>
> If a file is removed from or added to the directory after the most
> recent call to opendir() or rewinddir(), whether a subsequent call
> to readdir() returns an entry for that file is unspecified.
>
> So if the libc that `find` links against choose to return that entry
> in readdir(), a possible sequence of events is the following:
>
> 1. find emits foo.h
> 2. xargs executes `perl -i foo.h`
> 3. perl (pid=100) creates temporary file `XXXXXXXX`
> 4. find sees file `XXXXXXXX` and emit it
> 5. PID 100 exits, cleaning up the temporary file `XXXXXXXX`
> 6. xargs executes `perl -i XXXXXXXX`
> 7. perl (pid=200) tries to read the file, but it doesn't exist any more.
>
> ... triggering the error message.
>
> One can reproduce the bug with the following command (assuming PWD
> contains the list of headers in kheaders.tar.xz)
>
> for i in $(seq 100); do
> find -type f -print0 |
> xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;';
> done
>
> With a `find` linking against musl libc, the error message is emitted
> 6/100 times.
>
> The fix:
>
> This change store the results of `find` before feeding them into xargs.


store -> stores



>
> diff --git a/kernel/gen_kheaders.sh b/kernel/gen_kheaders.sh
> index 7e1340da5aca..4d70e6377da5 100755
> --- a/kernel/gen_kheaders.sh
> +++ b/kernel/gen_kheaders.sh
> @@ -84,8 +84,12 @@ for f in $dir_list;
> done | cpio --quiet -pdu $cpio_dir >/dev/null 2>&1
>
> # Remove comments except SDPX lines
> -find $cpio_dir -type f -print0 |
> +# Use a temporary file to store directory contents to prevent find/xargs from
> +# seeing temporary files created by perl.
> +find $cpio_dir -type f -print0 > "${cpio_dir}.contents.txt"
> +cat "${cpio_dir}.contents.txt" | \
> xargs -0 -P8 -n1 perl -pi -e 'BEGIN {undef $/;}; s/\/\*((?!SPDX).)*?\*\///smg;'


shellcheck pointed out the useless 'cat'.


In kernel/gen_kheaders.sh line 90:
cat "${cpio_dir}.contents.txt" | \
^------------------------^ SC2002 (style): Useless cat. Consider
'cmd < file | ..' or 'cmd file | ..' instead.



xargs ... < "${cpio_dir}.contents.txt"




Maybe

xargs -a "${cpio_dir}.contents.txt" ...

can be possible, but the -a option is not described in OpenGroup.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/xargs.html






> +rm -f "${cpio_dir}.contents.txt"
>
> # Create archive and try to normalize metadata for reproducibility.
> tar "${KBUILD_BUILD_TIMESTAMP:+--mtime=$KBUILD_BUILD_TIMESTAMP}" \
> --
> 2.47.0.338.g60cca15819-goog
>


--
Best Regards

Masahiro Yamada