Re: [PATCH 06/20] early_res: separate common memmap func from e820.c to fw_memmap.c

From: Yinghai Lu
Date: Mon Mar 22 2010 - 19:42:53 EST


On 03/22/2010 03:53 PM, Thomas Gleixner wrote:
> On Mon, 22 Mar 2010, Yinghai Lu wrote:
>> On 03/22/2010 03:09 PM, Thomas Gleixner wrote:
>>> On Mon, 22 Mar 2010, Yinghai Lu wrote:
>>>> On 03/22/2010 12:37 PM, Ingo Molnar wrote:
>>
>>>> 1. need to keep e820
>>>
>>> That's neither an argument for using lmb nor an argument not to use
> lmb. e820 is x86 specific BIOS wreckage and its whole purpose is
>>> just to feed information into a (hopefully) generic early resource
>>> management facility.
>>>
>>> e820 _CANNOT_ be generalized. Period.
>
> I still want to know, what "need to keep e820" means for you.

keep most of arch/x86/kernel/e820.c, and later, when finish_e820_parsing() is
called, fill lmb.memory from the e820 entries with type E820_RAM.
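
A minimal sketch of that fill step (lmb_x86_fill() is a made-up name;
lmb_add()/lmb_analyze() are the existing lib/lmb.c API, and e820/e820entry
come from arch/x86/kernel/e820.c):

void __init lmb_x86_fill(void)
{
	int i;

	/* walk the already sanitized e820 map and hand RAM to lmb */
	for (i = 0; i < e820.nr_map; i++) {
		struct e820entry *ei = &e820.map[i];

		if (ei->type != E820_RAM)
			continue;

		lmb_add(ei->addr, ei->size);
	}

	lmb_analyze();
}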



>
>>>> 2. use e820 range with RAM to fill lmb.memory when finizing_e820
>>>
>>> What's finizing_e820 ???
>> finish_e820_parsing();
>
> Yinghai, come on. Are you really expecting that everyone involved in
> this discussion goes to look up what the heck finish_e820_parsing()
> is doing ?
>
> You want to explain why your solution is better or why lmb is not
> sufficient, so you better go and explain what finish_e820_parsing()
> is, why finish_e820_parsing() is important and why lmb cannot cope
> with it.

current x86:
a. set up the e820 array.
b. the early_param mem= and memmap= handling adjusts the e820 map.

so we don't need to call lmb_enforce_memory_limit(); the ordering would be
roughly like the sketch below.
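
(A sketch of that ordering only; setup_memory_sketch() is a made-up name,
parse_early_param() and finish_e820_parsing() are the real hooks, and
lmb_x86_fill() is the fill helper sketched above.)

void __init setup_memory_sketch(void)
{
	/* mem=/memmap= handlers edit the e820 map directly */
	parse_early_param();

	/* sanitize the adjusted map */
	finish_e820_parsing();

	/* only now fill lmb.memory, so it is already trimmed */
	lmb_x86_fill();
}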

>
>>>> 3. use lmb.reserved to replace early_res.
>>>
>>> What's the implication of doing that ?
>>
>> early_res array is only corresponding to lmb.reserved, aka reserved
>> region from kernel.
>
> Is it only corresponding (somehow) or is it a full equivalent ?

not a full equivalent: the early_res array is not sorted and merged the way
lmb regions are.

>
>>>> current lmb is merging the region, we can not use name tag any more.
>>>
>>> What's wrong with merging of regions ? Are you arguing about a
>>> specific region ("the region") ?
>
> Care to answer my question ?
if ranges get merged, you cannot keep a name attached to them any more.
>
>>>
>>> Which name tag ? And why is that name tag important ?
>>
>> struct early_res {
>>	u64 start, end;
>>	char name[15];
>>	char overlap_ok;
>> };
>
> I'm starting to get annoyed, really. What is that name field for and
> why is that "name" field important ?

at least when some code later frees a wrong range, we can figure out who caused the problem.
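
For example, kernel/early_res.c can then name the offender when a free does
not match a reservation. A sketch only (the message format with the name is
illustrative; drop_range() and the early_res overlap lookup are the existing
code there):

void __init free_early(u64 start, u64 end)
{
	struct early_res *r;
	int i;

	i = find_overlapped_early(start, end);
	r = &early_res[i];
	if (i >= max_early_res || r->end != end || r->start != start)
		panic("free_early on not reserved area: %llx-%llx, overlaps '%s'!",
		      start, end - 1,
		      i < max_early_res ? r->name : "?");

	drop_range(i);
}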

>
>>>
>>>> may need to dump early_memtest, and use early_res for bootmem at
>>>> first.
>>>
>>> Why exactly might early_memtest no longer be possible ?
>>
>> early_memtest needs to call find_e820_area_size();
>> current lmb doesn't have that kind of find utility.
>> the one memory subtract reserved memory by kernel.
>
> What subtracts what ? And why is it that hard to fix that ?

lmb.memory minus lmb.reserved,

or the e820 E820_RAM entries minus early_res.

move some of that code from kernel/early_res.c to lib/lmb.c?

>
>>>
>>> What means "early_res for bootmem" ?
>>
>> use early_res to replace bootmem, the CONFIG_NO_BOOTMEM.
>> that need early_res can be double or increase the slots automatically.
>
> -ENOPARSE
>
> Yinghai, I asked you to take your time and explain things in detail
> instead of shooting unparseable answers within a minute.
>
> Everyone else in this discussion tries to be as explanatory as
> possible, just you expect that everyone else is going to dig out the
> crystal ball to understand the deeper meanings of your patches.
>
> Again, please take your time to explain what needs to be done or what
> is impossible to solve in your opinion, so we can get that resolved in
> a way which is satisfactory and useful for all parties involved.

to make x86 use lmb, we need to extend lmb with find_early_area(), along these lines:

/* Return the index of the first lmb.reserved region overlapping
 * [start, end), or the index past the last used slot if none does. */
static int __init find_overlapped_early(u64 start, u64 end)
{
	int i;
	struct lmb_property *r;

	for (i = 0; i < lmb.reserved.cnt && lmb.reserved.region[i].size; i++) {
		r = &lmb.reserved.region[i];
		if (end > r->base && start < (r->base + r->size))
			break;
	}

	return i;
}


/* Check for already reserved areas: if [*addrp, *addrp + size) overlaps
 * a reservation, bump *addrp past it (aligned) and try again. */
static inline int __init bad_addr(u64 *addrp, u64 size, u64 align)
{
	int i;
	u64 addr = *addrp;
	int changed = 0;
	struct lmb_property *r;
again:
	i = find_overlapped_early(addr, addr + size);
	r = &lmb.reserved.region[i];
	if (i < lmb.reserved.cnt && r->size) {
		*addrp = addr = round_up(r->base + r->size, align);
		changed = 1;
		goto again;
	}
	return changed;
}

u64 __init find_early_area(u64 ei_start, u64 ei_last, u64 start, u64 end,
			   u64 size, u64 align)
{
	u64 addr, last;

	addr = round_up(ei_start, align);
	if (addr < start)
		addr = round_up(start, align);
	if (addr >= ei_last)
		goto out;
	while (bad_addr(&addr, size, align) && addr + size <= ei_last)
		;
	last = addr + size;
	if (last > ei_last)
		goto out;
	if (last > end)
		goto out;

	return addr;

out:
	return -1ULL;
}

find_early_area_size()...
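
(A sketch of what that size-returning variant could look like, modeled on
x86's find_e820_area_size() but walking lmb.reserved; illustrative, not the
final code. It returns the first free address and, in *sizep, how many free
bytes follow before the next reservation.)

u64 __init find_early_area_size(u64 ei_start, u64 ei_last, u64 start,
				u64 *sizep, u64 align)
{
	u64 addr, last;
	int i;

	addr = round_up(ei_start, align);
	if (addr < start)
		addr = round_up(start, align);
	if (addr >= ei_last)
		return -1ULL;

	/* step past any reservation that covers addr */
	while (bad_addr(&addr, 1, align) && addr < ei_last)
		;
	if (addr >= ei_last)
		return -1ULL;

	/* free space runs to the end of the entry or the next reservation */
	last = ei_last;
	for (i = 0; i < lmb.reserved.cnt && lmb.reserved.region[i].size; i++) {
		struct lmb_property *r = &lmb.reserved.region[i];

		if (r->base >= addr && r->base < last)
			last = r->base;
	}

	*sizep = last - addr;
	return addr;
}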

and using those we can have find_lmb_area():

/*
 * Find a free area with specified alignment in a specific range.
 */
u64 __init find_lmb_area(u64 start, u64 end, u64 size, u64 align)
{
	int i;

	for (i = 0; i < lmb.memory.cnt; i++) {
		u64 ei_start = lmb.memory.region[i].base;
		u64 ei_last = ei_start + lmb.memory.region[i].size;
		u64 addr;

		addr = find_early_area(ei_start, ei_last, start, end,
				       size, align);

		if (addr != -1ULL)
			return addr;
	}
	return -1ULL;
}
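
(Illustrative use, with a made-up caller: find and reserve one page below
32MB; lmb_reserve() is the existing lib/lmb.c API.)

static void __init example_reserve_table(void)
{
	u64 addr = find_lmb_area(0, 32 << 20, PAGE_SIZE, PAGE_SIZE);

	if (addr == -1ULL)
		panic("can not find space for early table");
	lmb_reserve(addr, PAGE_SIZE);
}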


also, later we can use it together with the active_range info for the bootmem replacement:

u64 __init find_memory_core_early(int nid, u64 size, u64 align,
				  u64 goal, u64 limit)
{
	int i;

	/* need to go over early_node_map to find out good range for node */
	for_each_active_range_index_in_nid(i, nid) {
		u64 addr;
		u64 ei_start, ei_last;

		ei_last = early_node_map[i].end_pfn;
		ei_last <<= PAGE_SHIFT;
		ei_start = early_node_map[i].start_pfn;
		ei_start <<= PAGE_SHIFT;
		addr = find_early_area(ei_start, ei_last,
				       goal, limit, size, align);

		if (addr == -1ULL)
			continue;

		return addr;
	}

	return -1ULL;
}
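
(And a sketch of how the CONFIG_NO_BOOTMEM path could sit on top of it; the
wrapper name and the fall-back-to-any-node policy here are illustrative:)

void * __init alloc_bootmem_core_sketch(int nid, u64 size, u64 align,
					u64 goal, u64 limit)
{
	u64 addr;

	addr = find_memory_core_early(nid, size, align, goal, limit);
	if (addr == -1ULL) {
		/* fall back to any node */
		addr = find_lmb_area(goal, limit, size, align);
		if (addr == -1ULL)
			return NULL;
	}

	lmb_reserve(addr, size);
	memset(__va(addr), 0, size);
	return __va(addr);
}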


Yinghai