Re: [tip: efi/core] efi: cper: fix scnprintf() use in cper_mem_err_location()

From: James Morse
Date: Tue Aug 31 2021 - 12:02:36 EST


Hi guys,

On 28/08/2021 13:18, Ard Biesheuvel wrote:
> (add RAS/APEI folks)
>
> On Sat, 28 Aug 2021 at 13:31, Joe Perches <joe@xxxxxxxxxxx> wrote:
>>
>> On Sat, 2021-08-28 at 10:37 +0000, tip-bot2 for Rasmus Villemoes wrote:
>>> The following commit has been merged into the efi/core branch of tip:
>> []
>>> efi: cper: fix scnprintf() use in cper_mem_err_location()
>>>
>>> The last two if-clauses fail to update n, so whatever they might have
>>> written at &msg[n] would be cut off by the final nul-termination.
>>>
>>> That nul-termination is redundant; scnprintf(), just like snprintf(),
>>> guarantees a nul-terminated output buffer, provided the buffer size is
>>> positive.
>>>
>>> And there's no need to discount one byte from the initial buffer;
>>> vsnprintf() expects to be given the full buffer size - it's not going
>>> to write the nul-terminator one beyond the given (buffer, size) pair.
>> []
>>> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
>> []
>>> @@ -221,7 +221,7 @@ static int cper_mem_err_location(struct cper_mem_err_compact *mem, char *msg)
>>> return 0;
>>>
>>>
>>> n = 0;
>>> - len = CPER_REC_LEN - 1;
>>> + len = CPER_REC_LEN;
>>> if (mem->validation_bits & CPER_MEM_VALID_NODE)
>>> n += scnprintf(msg + n, len - n, "node: %d ", mem->node);
>>> if (mem->validation_bits & CPER_MEM_VALID_CARD)
>>
>> [etc...]
>>
>> Is this always single threaded?
>>
>> It doesn't seem this is safe for reentry as the output buffer
>> being written into is a single static
>>
>> static char rcd_decode_str[CPER_REC_LEN];

> Good question. CPER error record decoding typically occurs in response
> to an error event raised by firmware, so I think this happens to work
> fine in practice. Whether this is guaranteed, I'm not so sure ...

There is locking to prevent concurrent access to the firmware buffer, but that only
serialises the CPER records being copied. The printing may happen in parallel on different
CPUs if there are multiple errors.

cper_estatus_print() is called in NMI context if an NMI indicates a fatal error. See
__ghes_panic().


Thanks,

James