Re: [PATCHv3] gcc: disable '-Wstrignop-overread' universally for gcc-13+ and FORTIFY_SOURCE

From: Kees Cook
Date: Thu Dec 12 2024 - 13:24:54 EST


On Sun, Dec 08, 2024 at 09:42:28PM +0530, Nilay Shroff wrote:
> While building the powerpc code using gcc 13.x, I came across following
> errors generated for kernel/padata.c file:
>
> CC kernel/padata.o
> In file included from ./include/linux/string.h:390,
> from ./arch/powerpc/include/asm/paca.h:16,
> from ./arch/powerpc/include/asm/current.h:13,
> from ./include/linux/thread_info.h:23,
> from ./include/asm-generic/preempt.h:5,
> from ./arch/powerpc/include/generated/asm/preempt.h:1,
> from ./include/linux/preempt.h:79,
> from ./include/linux/spinlock.h:56,
> from ./include/linux/swait.h:7,
> from ./include/linux/completion.h:12,
> from kernel/padata.c:14:
> In function ‘bitmap_copy’,
> inlined from ‘cpumask_copy’ at ./include/linux/cpumask.h:839:2,
> inlined from ‘__padata_set_cpumasks’ at kernel/padata.c:730:2:
> ./include/linux/fortify-string.h:114:33: error: ‘__builtin_memcpy’ reading between 257 and 536870904 bytes from a region of size 256 [-Werror=stringop-overread]
> 114 | #define __underlying_memcpy __builtin_memcpy
> | ^
> ./include/linux/fortify-string.h:633:9: note: in expansion of macro ‘__underlying_memcpy’
> 633 | __underlying_##op(p, q, __fortify_size); \
> | ^~~~~~~~~~~~~
> ./include/linux/fortify-string.h:678:26: note: in expansion of macro ‘__fortify_memcpy_chk’
> 678 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
> | ^~~~~~~~~~~~~~~~~~~~
> ./include/linux/bitmap.h:259:17: note: in expansion of macro ‘memcpy’
> 259 | memcpy(dst, src, len);
> | ^~~~~~
> kernel/padata.c: In function ‘__padata_set_cpumasks’:
> kernel/padata.c:713:48: note: source object ‘pcpumask’ of size [0, 256]
> 713 | cpumask_var_t pcpumask,
> | ~~~~~~~~~~~~~~^~~~~~~~

We've seen this also on x86 with 320 CPUs[1] (which perhaps was already
known since I see Thomas Weißschuh in CC already.

Building with -fdiagnostics-details we see:

In function 'bitmap_copy',
inlined from 'cpumask_copy' at ../include/linux/cpumask.h:839:2,
inlined from '__padata_set_cpumasks' at ../kernel/padata.c:723:2:
../include/linux/fortify-string.h:114:33: error: '__builtin_memcpy' reading between 41 and 536870904 bytes from a region of size 40 [-Werror=stringop-overread]
114 | #define __underlying_memcpy __builtin_memcpy
| ^
../include/linux/fortify-string.h:633:9: note: in expansion of macro '__underlying_memcpy'
633 | __underlying_##op(p, q, __fortify_size); \
| ^~~~~~~~~~~~~
../include/linux/fortify-string.h:678:26: note: in expansion of macro '__fortify_memcpy_chk'
678 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
| ^~~~~~~~~~~~~~~~~~~~
../include/linux/bitmap.h:259:17: note: in expansion of macro 'memcpy'
259 | memcpy(dst, src, len);
| ^~~~~~
'__padata_set_cpumasks': events 1-2
../include/linux/fortify-string.h:613:36:
612 | if (p_size_field != SIZE_MAX &&
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
613 | p_size != p_size_field && p_size_field < size)
| ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
| |
| (1) when the condition is evaluated to false
| (2) when the condition is evaluated to true
'__padata_set_cpumasks': event 3
114 | #define __underlying_memcpy __builtin_memcpy
| ^
| |
| (3) out of array bounds here

What's happening is that GCC is seeing that the run-time checks of
FORTIFY_SOURCE is checking for this condition (i.e. there is a path
through the code where the bounds could be too large and it still calls
memcpy) -- which is the point of this runtime check -- but that GCC has
found a compile-time path where this can be true.

Built without CONFIG_WERROR, we can examine the padata.o file for the
FORTIFY_SOURCE warning string to find the field:

$ strings gcc-diag/kernel/padata.o | grep ^field
field "dst" at include/linux/bitmap.h:259

which confirms it is this, which has already been seen in the thread:

static __always_inline
void bitmap_copy(unsigned long *dst, const unsigned long *src, unsigned
int nbits)
{
unsigned int len = bitmap_size(nbits);

if (small_const_nbits(nbits))
*dst = *src;
else
memcpy(dst, src, len);
}

and I think what Nathan already discussed[2] is all true (i.e. that
nr_cpu_ids is unknown -- but bounded to a smaller range than [0..UINT_MAX]
due to the calculation in bitmap_size()). Nathan's fix silences the
warning. We could narrow the scope to only run-time-specified nr_cpus:


diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 9278a50d514f..8f1a694109e9 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -836,6 +836,8 @@ void cpumask_shift_left(struct cpumask *dstp, const struct cpumask *srcp, int n)
static __always_inline
void cpumask_copy(struct cpumask *dstp, const struct cpumask *srcp)
{
+ if (!__builtin_constant_p(large_cpumask_bits))
+ BUG_ON(large_cpumask_bits > NR_CPUS);
bitmap_copy(cpumask_bits(dstp), cpumask_bits(srcp), large_cpumask_bits);
}


Or we could avoid the BUG_ON check and simply silence the warning
explicitly:


diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 9278a50d514f..0725b26f21b8 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -836,6 +836,8 @@ void cpumask_shift_left(struct cpumask *dstp, const struct cpumask *srcp, int n)
static __always_inline
void cpumask_copy(struct cpumask *dstp, const struct cpumask *srcp)
{
+ if (!__builtin_constant_p(large_cpumask_bits))
+ OPTIMIZER_HIDE_VAR(large_cpumask_bits);
bitmap_copy(cpumask_bits(dstp), cpumask_bits(srcp), large_cpumask_bits);
}

Or we could unconditionally put the OPTIMIZER_HIDE_VAR() inside
bitmap_copy() itself:


diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 262b6596eca5..5503ccabe05a 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -251,12 +251,14 @@ static __always_inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
static __always_inline
void bitmap_copy(unsigned long *dst, const unsigned long *src, unsigned int nbits)
{
- unsigned int len = bitmap_size(nbits);
-
- if (small_const_nbits(nbits))
+ if (small_const_nbits(nbits)) {
*dst = *src;
- else
+ } else {
+ unsigned int len = bitmap_size(nbits);
+
+ OPTIMIZER_HIDE_VAR(len);
memcpy(dst, src, len);
+ }
}

/*

I prefer any of these to doing the build-system disabling of the
warning.

-Kees

[1] https://lore.kernel.org/all/202411021337.85E9BB06@keescook/
[2] https://lore.kernel.org/all/20241209193558.GA1597021@ax162/

--
Kees Cook