Re: [PATCH v2] x86, efi: Do not reserve boot services regions withinreserved areas

From: Maarten Lankhorst
Date: Tue Jun 14 2011 - 14:32:22 EST


Op 14-06-11 20:10, Yinghai Lu schreef:
> On 06/14/2011 09:19 AM, Maarten Lankhorst wrote:
>> Commit 916f676f8dc started reserving boot service code since some systems
>> require you to keep that code around until SetVirtualAddressMap is called.
>>
>> However, in some cases those areas will overlap with reserved regions.
>> The proper medium-term fix is to fix the bootloader to prevent the
>> conflicts from occurring by moving the kernel to a better position,
>> but the kernel should check for this possibility, and only reserve regions
>> which can be reserved.
>>
>> Signed-off-by: Maarten Lankhorst <m.b.lankhorst@xxxxxxxxx>
>>
>> ---
>>
>> V2: Removed some printks and a unrelated change
>>
>> diff --git a/arch/x86/include/asm/memblock.h b/arch/x86/include/asm/memblock.h
>> index 19ae14b..0cd3800 100644
>> --- a/arch/x86/include/asm/memblock.h
>> +++ b/arch/x86/include/asm/memblock.h
>> @@ -4,7 +4,6 @@
>> #define ARCH_DISCARD_MEMBLOCK
>>
>> u64 memblock_x86_find_in_range_size(u64 start, u64 *sizep, u64 align);
>> -void memblock_x86_to_bootmem(u64 start, u64 end);
>>
>> void memblock_x86_reserve_range(u64 start, u64 end, char *name);
>> void memblock_x86_free_range(u64 start, u64 end);
>> @@ -19,5 +18,6 @@ u64 memblock_x86_hole_size(u64 start, u64 end);
>> u64 memblock_x86_find_in_range_node(int nid, u64 start, u64 end, u64 size, u64 align);
>> u64 memblock_x86_free_memory_in_range(u64 addr, u64 limit);
>> u64 memblock_x86_memory_in_range(u64 addr, u64 limit);
>> +bool memblock_x86_check_reserved_size(u64 *addrp, u64 *sizep, u64 align);
>>
>> #endif
>> diff --git a/arch/x86/mm/memblock.c b/arch/x86/mm/memblock.c
>> index aa11693..992da5e 100644
>> --- a/arch/x86/mm/memblock.c
>> +++ b/arch/x86/mm/memblock.c
>> @@ -8,7 +8,7 @@
>> #include <linux/range.h>
>>
>> /* Check for already reserved areas */
>> -static bool __init check_with_memblock_reserved_size(u64 *addrp, u64 *sizep, u64 align)
>> +bool __init memblock_x86_check_reserved_size(u64 *addrp, u64 *sizep, u64 align)
>> {
>> struct memblock_region *r;
>> u64 addr = *addrp, last;
>> @@ -59,7 +59,7 @@ u64 __init memblock_x86_find_in_range_size(u64 start, u64 *sizep, u64 align)
>> if (addr >= ei_last)
>> continue;
>> *sizep = ei_last - addr;
>> - while (check_with_memblock_reserved_size(&addr, sizep, align))
>> + while (memblock_x86_check_reserved_size(&addr, sizep, align))
>> ;
>>
>> if (*sizep)
>> diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
>> index 0d3a4fa..b8afde4 100644
>> --- a/arch/x86/platform/efi/efi.c
>> +++ b/arch/x86/platform/efi/efi.c
>> @@ -310,14 +310,30 @@ void __init efi_reserve_boot_services(void)
>>
>> for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
>> efi_memory_desc_t *md = p;
>> - unsigned long long start = md->phys_addr;
>> - unsigned long long size = md->num_pages << EFI_PAGE_SHIFT;
>> + u64 start = md->phys_addr;
>> + u64 size = md->num_pages << EFI_PAGE_SHIFT;
>>
>> if (md->type != EFI_BOOT_SERVICES_CODE &&
>> md->type != EFI_BOOT_SERVICES_DATA)
>> continue;
>> -
>> - memblock_x86_reserve_range(start, start + size, "EFI Boot");
>> + /* Only reserve where possible:
>> + * - Not within any already allocated areas
>> + * - Not over any memory area (really needed, if above?)
>> + * - Not within any part of the kernel
>> + * - Not the bios reserved area
>> + */
>> + if ((start+size >= virt_to_phys(_text)
>> + && start <= virt_to_phys(_end)) ||
>> + !e820_all_mapped(start, start+size, E820_RAM) ||
>> + memblock_x86_check_reserved_size(&start, &size,
>> + 1<<EFI_PAGE_SHIFT)) {
>> + /* Could not reserve, skip it */
>> + md->num_pages = 0;
>> + memblock_dbg(PFX "Could not reserve boot area "
>> + "[0x%llx-0x%llx)\n", start, start+size);
> how about partial overlapping?
If any overlap is detected I don't reserve the area. mjg is working on a
patch to prevent this from happening, but allocating partial ranges is
simply not worth the effort. Seems it's not useful either, since the
failures I get are range 0-8000 (bios reserved 0-10000), a subset of
what memblock already completely used for its configuration, and the
remainder being completely blocked by the kernel.

The more invasive fix is moving it to grub and hope everyone will
fix their bootloader immediately, but until then this is as good as it
gets. I don't think we'd get enough benefits about reserving partial
ranges, compared to how complicated the code would get.

~Maarten

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/