Re: [PATCH] efi: make the min and max mmap slack slots configurable

From: Hamza Mahfooz
Date: Mon Dec 09 2024 - 13:04:12 EST


On 12/9/24 13:00, Ard Biesheuvel wrote:
> On Mon, 9 Dec 2024 at 18:02, Hamza Mahfooz
> <hamzamahfooz@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>> Hi Ard,
>>
>> On 12/9/24 11:40, Ard Biesheuvel wrote:
>>> Hello Hamza,
>>>
>>> Thanks for the patch.
>>>
>>> On Mon, 9 Dec 2024 at 17:25, Hamza Mahfooz
>>> <hamzamahfooz@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Recent platforms
>>>
>>> Which platforms are you referring to here?
>>
>> Grace Blackwell 200 in particular.
>>
>
> Those are arm64 systems, right?

Yup.

>
>>>
>>>> require more slack slots than the current value of
>>>> EFI_MMAP_NR_SLACK_SLOTS, otherwise they fail to boot. So, introduce
>>>> EFI_MIN_NR_MMAP_SLACK_SLOTS and EFI_MAX_NR_MMAP_SLACK_SLOTS
>>>> and use them to determine a number of slots that the platform
>>>> is willing to accept.
>>>>
>>>
>>> What does 'acceptance' mean in this case?
>>
>> Not having allocate_pool() return EFI_BUFFER_TOO_SMALL.
>>
>
> I think you may have gotten confused here - see below
>
>>>
>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>> Cc: Tyler Hicks <code@xxxxxxxxxxx>
>>>> Tested-by: Brian Nguyen <nguyenbrian@xxxxxxxxxxxxx>
>>>> Tested-by: Jacob Pan <panj@xxxxxxxxxxxxx>
>>>> Reviewed-by: Allen Pais <apais@xxxxxxxxxxxxx>
>>>
>>> I appreciate the effort of your colleagues, but if these
>>> tested/reviewed-bys were not given on an open list, they are
>>> meaningless, and I am going to drop them unless the people in question
>>> reply to this thread.
>>>
>>>> Signed-off-by: Hamza Mahfooz <hamzamahfooz@xxxxxxxxxxxxxxxxxxx>
>>>> ---
> ...
>>>> diff --git a/drivers/firmware/efi/libstub/mem.c b/drivers/firmware/efi/libstub/mem.c
>>>> index 4f1fa302234d..cab25183b790 100644
>>>> --- a/drivers/firmware/efi/libstub/mem.c
>>>> +++ b/drivers/firmware/efi/libstub/mem.c
>>>> @@ -13,32 +13,47 @@
>>>> * configuration table
>>>> *
>>>> * Retrieve the UEFI memory map. The allocated memory leaves room for
>>>> - * up to EFI_MMAP_NR_SLACK_SLOTS additional memory map entries.
>>>> + * up to CONFIG_EFI_MAX_NR_MMAP_SLACK_SLOTS additional memory map entries.
>>>> *
>>>> * Return: status code
>>>> */
>>>> efi_status_t efi_get_memory_map(struct efi_boot_memmap **map,
>>>> - bool install_cfg_tbl)
>>>> + bool install_cfg_tbl,
>>>> + unsigned int *n)
>>>
>>> What is the purpose of 'n'? Having single letter names for function
>>> parameters is not great for legibility.
>>>
>>>> {
>>>> int memtype = install_cfg_tbl ? EFI_ACPI_RECLAIM_MEMORY
>>>> : EFI_LOADER_DATA;
>>>> efi_guid_t tbl_guid = LINUX_EFI_BOOT_MEMMAP_GUID;
>>>> + unsigned int nr = CONFIG_EFI_MIN_NR_MMAP_SLACK_SLOTS;
>>>> struct efi_boot_memmap *m, tmp;
>>>> efi_status_t status;
>>>> unsigned long size;
>>>>
>>>> + BUILD_BUG_ON(!is_power_of_2(CONFIG_EFI_MIN_NR_MMAP_SLACK_SLOTS) ||
>>>> + !is_power_of_2(CONFIG_EFI_MAX_NR_MMAP_SLACK_SLOTS) ||
>>>> + CONFIG_EFI_MIN_NR_MMAP_SLACK_SLOTS >=
>>>> + CONFIG_EFI_MAX_NR_MMAP_SLACK_SLOTS);
>>>> +
>>>> tmp.map_size = 0;
>>>> status = efi_bs_call(get_memory_map, &tmp.map_size, NULL, &tmp.map_key,
>>>> &tmp.desc_size, &tmp.desc_ver);
>>>> if (status != EFI_BUFFER_TOO_SMALL)
>>>> return EFI_LOAD_ERROR;
>>>>
>>>> - size = tmp.map_size + tmp.desc_size * EFI_MMAP_NR_SLACK_SLOTS;
>>>> - status = efi_bs_call(allocate_pool, memtype, sizeof(*m) + size,
>>>> - (void **)&m);
>>>> + do {
>>>> + size = tmp.map_size + tmp.desc_size * nr;
>>>> + status = efi_bs_call(allocate_pool, memtype, sizeof(*m) + size,
>>>> + (void **)&m);
>>>> + nr <<= 1;
>>>> + } while (status == EFI_BUFFER_TOO_SMALL &&
>>>> + nr <= CONFIG_EFI_MAX_NR_MMAP_SLACK_SLOTS);
>>>> +
>>>
>>> Under what circumstances would you expect AllocatePool() to return
>>> EFI_BUFFER_TOO_SMALL? What is the purpose of this loop?
>>
>> We have observed that allocate_pool() will return EFI_BUFFER_TOO_SMALL
>> if EFI_MMAP_NR_SLACK_SLOTS is less than 32. The loop is there so only
>> the minimum number of extra slots are allocated.
>>
>
> But allocate_pool() *never* returns EFI_BUFFER_TOO_SMALL. It is
> get_memory_map() that may return EFI_BUFFER_TOO_SMALL if the memory
> map is larger than the provided buffer. In this case, allocate_pool()
> needs to be called again to allocate a buffer of the appropriate size.
>
> So the loop needs to call get_memory_map() again, but given that the
> size is returned directly when the first call fails, this iterative
> logic seems misguided.
>
> I also think you might be misunderstanding the purpose of the slack
> slots. They exist to reduce the likelihood that the memory map grows
> more entries than can be accommodated in the buffer in cases where the
> first call to ExitBootServices() fails, and GetMemoryMap() needs to be
> called again; at that point, memory allocations are no longer possible
> (although the UEFI spec was relaxed in this regard between 2.6 and
> 2.10).
>
>
>>>
>>> How did you test this code?
>>
>> I was able to successfully boot the platform with this patch applied,
>> without it we need to append `efi=disable_early_pci_dma` to the kernel's
>> cmdline be able to boot the system.
>>
>
> allocate_pool() never returns EFI_BUFFER_TOO_SMALL, and so your loop
> executes only once. I cannot explain how this happens to fix the boot
> for you, but your patch as presented is deeply flawed.
>
> If bumping the number of slots to 32 also solves the problem, I'd
> happily consider that instead,

Ya, lets go with that, in that case.

>
>
>>
>>>
>>>> if (status != EFI_SUCCESS)
>>>> return status;
>>>>
>>>> + if (n)
>>>> + *n = nr;
>>>> +
>
> It seems to me that at this point, nr has been doubled after it was
> used to perform the allocation, so you are returning a wrong value
> here.