Re: [PATCH] coresight: etm4x: work around clang-12+ build failure
From: Nick Desaulniers
Date: Thu Feb 25 2021 - 16:24:46 EST
On Thu, Feb 25, 2021 at 8:45 AM Mathieu Poirier
<mathieu.poirier@xxxxxxxxxx> wrote:
>
> Good morning,
>
> On Thu, Feb 25, 2021 at 10:42:58AM +0100, Arnd Bergmann wrote:
> > From: Arnd Bergmann <arnd@xxxxxxxx>
> >
> > clang-12 fails to build the etm4x driver with -fsanitize=array-bounds:
> >
> > <instantiation>:1:7: error: expected constant expression in '.inst' directive
> > .inst (0xd5200000|((((2) << 19) | ((1) << 16) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 7) & 0x7)) << 12) | ((((((((((0x160 + (i * 4))))) >> 2))) & 0xf)) << 8) | (((((((((((0x160 + (i * 4))))) >> 2))) >> 4) & 0x7)) << 5)))|(.L__reg_num_x8))
> > ^
> > drivers/hwtracing/coresight/coresight-etm4x-core.c:702:4: note: while in macro instantiation
> > etm4x_relaxed_read32(csa, TRCCNTVRn(i));
> > ^
> > drivers/hwtracing/coresight/coresight-etm4x.h:403:4: note: expanded from macro 'etm4x_relaxed_read32'
> > read_etm4x_sysreg_offset((offset), false)))
> > ^
> > drivers/hwtracing/coresight/coresight-etm4x.h:383:12: note: expanded from macro 'read_etm4x_sysreg_offset'
> > __val = read_etm4x_sysreg_const_offset((offset)); \
> > ^
> > drivers/hwtracing/coresight/coresight-etm4x.h:149:2: note: expanded from macro 'read_etm4x_sysreg_const_offset'
> > READ_ETM4x_REG(ETM4x_OFFSET_TO_REG(offset))
> > ^
> > drivers/hwtracing/coresight/coresight-etm4x.h:144:2: note: expanded from macro 'READ_ETM4x_REG'
> > read_sysreg_s(ETM4x_REG_NUM_TO_SYSREG((reg)))
> > ^
> > arch/arm64/include/asm/sysreg.h:1108:15: note: expanded from macro 'read_sysreg_s'
> > asm volatile(__mrs_s("%0", r) : "=r" (__val)); \
> > ^
> > arch/arm64/include/asm/sysreg.h:1074:2: note: expanded from macro '__mrs_s'
> > " mrs_s " v ", " __stringify(r) "\n" \
> > ^
> >
> > It appears that the __builtin_constant_p() check in
> > read_etm4x_sysreg_offset() falsely returns 'true' here because clang
> > finds that an out-of-bounds access to config->cntr_val[] cannot
> > happen, and then it unrolls the loop with constant register numbers. Then
Is a sanitizer enabled, that would trap on OOB?
> > when actually emitting the output, it fails to figure out the value again.
> >
> > While this is incorrect behavior in clang, it is easy to work around
> > by avoiding the out-of-bounds array access. Do this by limiting the
> > loop counter to the actual dimension of the array.
> >
> > Link: https://github.com/ClangBuiltLinux/linux/issues/1310
> > Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>
> > ---
> > drivers/hwtracing/coresight/coresight-etm4x-core.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > index 15016f757828..4cccf874a602 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > @@ -691,13 +691,13 @@ static void etm4_disable_hw(void *info)
> > "timeout while waiting for PM stable Trace Status\n");
> >
> > /* read the status of the single shot comparators */
> > - for (i = 0; i < drvdata->nr_ss_cmp; i++) {
> > + for (i = 0; i < min_t(u32, drvdata->nr_ss_cmp, ETM_MAX_SS_CMP); i++) {
> > config->ss_status[i] =
> > etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> > }
> >
> > /* read back the current counter values */
> > - for (i = 0; i < drvdata->nr_cntr; i++) {
> > + for (i = 0; i < min_t(u32, drvdata->nr_cntr, ETMv4_MAX_CNTR); i++) {
>
> This patch will work and I'd be happy to apply it if this was the only instance,
> but there are dozens of places in the coresight drivers where such patterns are
> used. Why are those not flagged as well? And shouldn't the real fix be with
> clang?
It's important to understand that __builtin_constant_p is highly
sensitive to optimizations; code using it typically relies on
optimizations being performed before it's evaluated. Which
optimizations are applied, whether they succeed, in what order, and by
which compiler or compiler version can all affect what
__builtin_constant_p evaluates to. Code generally needs to be written
so that either result is acceptable: __builtin_constant_p evaluating,
or failing to evaluate, to a particular value is _not a bug_.
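
To illustrate the point, here is a minimal sketch of code that stays
correct whichever way __builtin_constant_p evaluates. The names
ENCODE_REG, encode_reg_runtime and encode_reg are hypothetical and are
not taken from the etm4x driver; the bit layout is only loosely modeled
on the expression in the error message above.

```c
#include <assert.h>

/* Hypothetical constant-folded encoding: bits [5:2] of the offset, shifted to bit 8. */
#define ENCODE_REG(off)	((((off) >> 2) & 0xf) << 8)

/* Runtime fallback that computes the same value without needing a constant. */
static unsigned int encode_reg_runtime(unsigned int off)
{
	return ((off >> 2) & 0xf) << 8;
}

/*
 * Take the constant path only when the compiler reports a constant.
 * Both arms yield the same result, so the outcome of
 * __builtin_constant_p() affects code generation, not correctness.
 */
#define encode_reg(off)						\
	(__builtin_constant_p(off) ? ENCODE_REG(off)		\
				   : encode_reg_runtime(off))

/* Sanity helper: the two paths must agree for any offset. */
static void check_paths_agree(unsigned int off)
{
	assert(ENCODE_REG(off) == encode_reg_runtime(off));
}
```

With this shape, a loop such as "for (i = 0; i < n; i++) encode_reg(0x160 + i * 4);"
gives the same values whether or not the compiler unrolls it and treats
each offset as a constant.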
>
> Thanks,
> Mathieu
>
> > config->cntr_val[i] =
> > etm4x_relaxed_read32(csa, TRCCNTVRn(i));
> > }
> > --
> > 2.29.2
> >
--
Thanks,
~Nick Desaulniers