Re: [PATCH] lib: Always NUL terminate ucs2_as_utf8
From: Laszlo Ersek
Date: Wed Apr 20 2016 - 08:45:15 EST
On 04/20/16 11:41, Chris Wilson wrote:
> On Wed, Apr 20, 2016 at 11:36:37AM +0200, Laszlo Ersek wrote:
>> On 04/20/16 10:37, Chris Wilson wrote:
>>> If the caller, in this case efivarfs_callback(), only provides sufficent
>>> room for the expanded utf8 and not enough to include the terminating NUL
>>> byte, that NUL byte is skipped.
>>
>> How does that occur? In efivarfs_callback() [fs/efivarfs/super.c], we have
>>
>> len = ucs2_utf8size(entry->var.VariableName);
>>
>> /* name, plus '-', plus GUID, plus NUL*/
>> name = kmalloc(len + 1 + EFI_VARIABLE_GUID_LEN + 1, GFP_KERNEL);
>> if (!name)
>> goto fail;
>>
>> ucs2_as_utf8(name, entry->var.VariableName, len);
>>
>> Instead, I think the following might be happening (note that RIP points
>> into efivar_variable_is_removable(), and I guess variable_matches()
>> (which is static) is inlined):
>>
>> efivarfs_callback() [fs/efivarfs/super.c]
>> efivar_variable_is_removable() [drivers/firmware/efi/vars.c]
>> variable_matches() [drivers/firmware/efi/vars.c]
>>
>> The bug seems to be in variable_matches(), which doesn't consider the
>> "len" parameter early enough. Namely, consider that we have the
>> following input:
>>
>> - var_name: "a"
>> - len: 1
>> - match_name "ab"
>>
>> In the first iteration of the loop (i.e., *match == 0):
>> - c = 'a'
>> - u = 'a'
>> - *match gets incremented to 1.
>>
>> In the second iteration of the loop (i.e., *match == 1):
>> - c = 'b'
>> - u = <indeterminate value> (that is, undefined behavior),
>> because (*match == len).
>>
>> This seems to be consistent with the error message "Caught 8-bit read
>> from uninitialized memory": namely, the array allocated for "name" in
>> efivarfs_callback() is indeed not pre-zeroed, and the ucs2_as_utf8()
>> function does not populate name[len] -- correctly, I would say.
>
> ucs2_as_utf8 reports that it returns a NUL terminated string.
I don't think it does. Here's the comment:
/*
* copy at most maxlength bytes of whole utf8 characters to dest from the
* ucs2 string src.
*
* The return value is the number of characters copied, not including the
* final NUL character.
*/
It doesn't seem to promise that the output will always be NUL-terminated. And, the code explicitly considers the case when there is no room for the final NUL.
A strictly NUL-terminated output might make for a better interface, but then the comment should be updated as well. Plus, I'm unsure if it would be aligned with Peter's original goal (and the current call sites). I'm not against changing the interface contract; I'll let Peter speak up.
> It didn't
> in this case.
> -Chris
>