Re: [tip: efi/core] efi: cper: fix scnprintf() use in cper_mem_err_location()

From: Ard Biesheuvel
Date: Wed Sep 01 2021 - 02:51:10 EST


On Tue, 31 Aug 2021 at 18:02, James Morse <james.morse@xxxxxxx> wrote:
>
> Hi guys,
>
> On 28/08/2021 13:18, Ard Biesheuvel wrote:
> > (add RAS/APEI folks)
> >
> > On Sat, 28 Aug 2021 at 13:31, Joe Perches <joe@xxxxxxxxxxx> wrote:
> >>
> >> On Sat, 2021-08-28 at 10:37 +0000, tip-bot2 for Rasmus Villemoes wrote:
> >>> The following commit has been merged into the efi/core branch of tip:
> >> []
> >>> efi: cper: fix scnprintf() use in cper_mem_err_location()
> >>>
> >>> The last two if-clauses fail to update n, so whatever they might have
> >>> written at &msg[n] would be cut off by the final nul-termination.
> >>>
> >>> That nul-termination is redundant; scnprintf(), just like snprintf(),
> >>> guarantees a nul-terminated output buffer, provided the buffer size is
> >>> positive.
> >>>
> >>> And there's no need to discount one byte from the initial buffer;
> >>> vsnprintf() expects to be given the full buffer size - it's not going
> >>> to write the nul-terminator one beyond the given (buffer, size) pair.
> >> []
> >>> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> >> []
> >>> @@ -221,7 +221,7 @@ static int cper_mem_err_location(struct cper_mem_err_compact *mem, char *msg)
> >>>  		return 0;
> >>>
> >>>  	n = 0;
> >>> -	len = CPER_REC_LEN - 1;
> >>> +	len = CPER_REC_LEN;
> >>>  	if (mem->validation_bits & CPER_MEM_VALID_NODE)
> >>>  		n += scnprintf(msg + n, len - n, "node: %d ", mem->node);
> >>>  	if (mem->validation_bits & CPER_MEM_VALID_CARD)
> >>
> >> [etc...]
> >>
> >> Is this always single threaded?
> >>
> >> It doesn't seem this is safe for reentry as the output buffer
> >> being written into is a single static
> >>
> >> static char rcd_decode_str[CPER_REC_LEN];
> >
> > Good question. CPER error record decoding typically occurs in response
> > to an error event raised by firmware, so I think this happens to work
> > fine in practice. Whether this is guaranteed, I'm not so sure ...
>
> There is locking to prevent concurrent access to the firmware buffer, but that only
> serialises the CPER records being copied. The printing may happen in parallel on different
> CPUs if there are multiple errors.
>
> cper_estatus_print() is called in NMI context if an NMI indicates a fatal error. See
> __ghes_panic().
>

OK, better to fix it then - there does not seem to be a good reason
for using a buffer in BSS here anyway.
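
As a rough illustration only (a userspace sketch, not the actual patch):
the caller can own the buffer and pass it in together with its full size,
so that concurrent callers never share storage. REC_LEN, struct mem_err,
scnprintf_sketch() and mem_err_location() are hypothetical stand-ins made
up for this example; scnprintf_sketch() only approximates the kernel's
scnprintf() on top of vsnprintf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define REC_LEN 256	/* stand-in for CPER_REC_LEN; the value is an assumption */

/* Hypothetical, simplified stand-in for struct cper_mem_err_compact. */
struct mem_err {
	unsigned int valid;	/* bit 0: node is valid, bit 1: card is valid */
	int node;
	int card;
};

/*
 * Userspace approximation of the kernel's scnprintf(): returns the number
 * of characters actually stored in buf, not counting the trailing NUL.
 * It is given the full buffer size and never writes beyond it.
 */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int i;

	if (size == 0)
		return 0;
	va_start(args, fmt);
	i = vsnprintf(buf, size, fmt, args);
	va_end(args);
	if (i < 0)
		return 0;
	return (size_t)i < size ? i : (int)(size - 1);
}

/*
 * The caller owns msg, so two CPUs decoding records at the same time
 * each write into their own buffer. n is updated after every write.
 */
static int mem_err_location(const struct mem_err *mem, char *msg, size_t len)
{
	int n = 0;

	if (mem->valid & 1)
		n += scnprintf_sketch(msg + n, len - n, "node: %d ", mem->node);
	if (mem->valid & 2)
		n += scnprintf_sketch(msg + n, len - n, "card: %d ", mem->card);
	return n;
}
```

A caller would declare something like char buf[REC_LEN] on its own stack,
call mem_err_location(&err, buf, sizeof(buf)), and print buf only if the
return value is non-zero, since nothing is written when no field is valid.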

I'll send out a patch.

Thanks,
Ard.