Re: [PATCH v2] scripts/setlocalversion: make git describe output more reliable

From: Nico Schottelius
Date: Thu Sep 17 2020 - 08:31:35 EST



Thanks for the patch Rasmus. Overall it looks good to me, be aligned to
the stable patch submission rules makes sense. A tiny thing though:

I did not calculate the exact collision probability with 12 characters
and it does not make sense to even discuss this, if this is a current
rule for stable patches. However we have a couple of 12's scattered in
the code. And if the submission rule changes in the future, we should
have a single location to update it.

So I suggest you introduce something on the line of:

...
num_chars=12
...
--abbrev=$num_chars
...

I guess you get the picture.

Greetings from the sunny mountains,

Nico


Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> writes:

> When building for an embedded target using Yocto, we're sometimes
> observing that the version string that gets built into vmlinux (and
> thus what uname -a reports) differs from the path under /lib/modules/
> where modules get installed in the rootfs, but only in the length of
> the -gabc123def suffix. Hence modprobe always fails.
>
> The problem is that Yocto has the concept of "sstate" (shared state),
> which allows different developers/buildbots/etc. to share build
> artifacts, based on a hash of all the metadata that went into building
> that artifact - and that metadata includes all dependencies (e.g. the
> compiler used etc.). That normally works quite well; usually a clean
> build (without using any sstate cache) done by one developer ends up
> being binary identical to a build done on another host. However, one
> thing that can cause two developers to end up with different builds
> [and thus make one's vmlinux package incompatible with the other's
> kernel-dev package], which is not captured by the metadata hashing, is
> this `git describe`: The output of that can be affected by
>
> (1) git version: before 2.11 git defaulted to a minimum of 7, since
> 2.11 (git.git commit e6c587) the default is dynamic based on the
> number of objects in the repo
> (2) hence even if both run the same git version, the output can differ
> based on how many remotes are being tracked (or just lots of local
> development branches or plain old garbage)
> (3) and of course somebody could have a core.abbrev config setting in
> ~/.gitconfig
>
> So in order to avoid `uname -a` output relying on such random details
> of the build environment which are rather hard to ensure are
> consistent between developers and buildbots, make sure the abbreviated
> sha1 always consists of exactly 12 hex characters. That is consistent
> with the current rule for -stable patches, and is almost always enough
> to identify the head commit unambigously - in the few cases where it
> does not, the v5.4.3-00021- prefix would certainly nail it down.
>
> Signed-off-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
> ---
> v2: use 12 instead of 15, and ensure that the result does have exactly
> 12 hex chars.
>
> scripts/setlocalversion | 21 ++++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/scripts/setlocalversion b/scripts/setlocalversion
> index 20f2efd57b11..bb709eda96cd 100755
> --- a/scripts/setlocalversion
> +++ b/scripts/setlocalversion
> @@ -45,7 +45,7 @@ scm_version()
>
> # Check for git and a git repo.
> if test -z "$(git rev-parse --show-cdup 2>/dev/null)" &&
> - head=$(git rev-parse --verify --short HEAD 2>/dev/null); then
> + head=$(git rev-parse --verify HEAD 2>/dev/null); then
>
> # If we are at a tagged commit (like "v2.6.30-rc6"), we ignore
> # it, because this version is defined in the top level Makefile.
> @@ -59,11 +59,22 @@ scm_version()
> fi
> # If we are past a tagged commit (like
> # "v2.6.30-rc5-302-g72357d5"), we pretty print it.
> - if atag="$(git describe 2>/dev/null)"; then
> - echo "$atag" | awk -F- '{printf("-%05d-%s", $(NF-1),$(NF))}'
> -
> - # If we don't have a tag at all we print -g{commitish}.
> + #
> + # Ensure the abbreviated sha1 has exactly 12
> + # hex characters, to make the output
> + # independent of git version, local
> + # core.abbrev settings and/or total number of
> + # objects in the current repository - passing
> + # --abbrev=12 ensures a minimum of 12, and the
> + # awk substr() then picks the 'g' and first 12
> + # hex chars.
> + if atag="$(git describe --abbrev=12 2>/dev/null)"; then
> + echo "$atag" | awk -F- '{printf("-%05d-%s", $(NF-1),substr($(NF),0,13))}'
> +
> + # If we don't have a tag at all we print -g{commitish},
> + # again using exactly 12 hex chars.
> else
> + head="$(echo $head | cut -c1-12)"
> printf '%s%s' -g $head
> fi
> fi


--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch