Re: [PATCH 2 of 4] hotplug-memory: adding non-section-aligned memory is bad

From: Jeremy Fitzhardinge
Date: Thu Mar 27 2008 - 22:51:55 EST


KAMEZAWA Hiroyuki wrote:
On Thu, 27 Mar 2008 19:13:20 -0700
Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:

Because firmware may occupy some areas within a section.
Firmware must exclude those areas when it reports memory to the kernel, so
E820, EFI, or ACPI's _CRS may return addresses and sizes that are not aligned.
register_memory_resource() and walk_memory_resource() are meant to skip
such areas silently. This is intended.

Ah, ok. sorry.

Jeremy, I think you can check whether you have 'struct page' or not by
pfn_valid().

If pfn_valid() == false, you should call add_memory() and create
a section/mem_map. If pfn_valid() == true, you should just clear the
PG_reserved bit in mem_map with online_page().

OK. Would that ever be necessary if I explicitly align my start and size?

Maybe not. But be careful not to register resources in an overlapping manner.

Yes. That's why I added add_memory_resource(), so I could use allocate_resource() to find a non-overlapping range to put the new memory.
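
To illustrate the idea of choosing a range that cannot overlap an existing resource, here is a minimal userspace sketch of a first-fit gap search. It is a model only: it does not call or reproduce the kernel's allocate_resource(), and struct range, align_up() and find_free_range() are hypothetical names invented for this example. Ranges are half-open, [start, end), and the busy list is assumed to be sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model type: a half-open address range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/* Round x up to a multiple of align (align must be a power of two). */
static unsigned long align_up(unsigned long x, unsigned long align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * First-fit search: walk the gaps between the busy ranges (sorted by
 * start, non-overlapping) and return the first aligned block of 'size'
 * that lies inside [min, max). Returns 0 and fills *out on success,
 * -1 if no gap is large enough.
 */
static int find_free_range(const struct range *busy, size_t n,
			   unsigned long min, unsigned long max,
			   unsigned long size, unsigned long align,
			   struct range *out)
{
	unsigned long cursor = min;
	size_t i;

	assert(align != 0 && (align & (align - 1)) == 0);

	for (i = 0; i <= n; i++) {
		/* The current gap ends at the next busy range, or at max. */
		unsigned long gap_end =
			(i < n && busy[i].start < max) ? busy[i].start : max;
		unsigned long start = align_up(cursor, align);

		/* 'start >= cursor' rejects wrap-around in align_up(). */
		if (start >= cursor && start < gap_end &&
		    gap_end - start >= size) {
			out->start = start;
			out->end = start + size;
			return 0;
		}
		/* Skip past the busy range before examining the next gap. */
		if (i < n && busy[i].end > cursor)
			cursor = busy[i].end;
	}
	return -1;
}
```

Because the result always lies in a gap between busy ranges, registering it afterwards cannot create an overlapping resource.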

(I wrote online_page() above, but online_pages() may be better.
It does everything you want.)

No, for my use-case the pages must be onlined one by one as they get some physical memory assigned to them. At the time I do add_memory(), I'm just allocating page structures, but there's no memory backing that range.

That's why I need to disable the sysfs onlining interface, because it bulk onlines the pages before there's anything behind them.
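
The difference between bulk onlining and onlining page by page can be pictured with a small userspace model. This is only a sketch of the idea described above, not kernel code: struct toy_page and the toy_* functions are hypothetical, with 'reserved' standing in for PG_reserved and 'backed' standing in for "physical memory has been assigned to this page".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page. */
struct toy_page {
	bool reserved;	/* models PG_reserved: not usable by the allocator */
	bool backed;	/* models "physical memory has been assigned" */
};

/* Models the add_memory() step: page structures exist, nothing is usable. */
static void toy_add_range(struct toy_page *pages, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		pages[i].reserved = true;
		pages[i].backed = false;
	}
}

/* Per-page onlining: refuse a page that has no memory behind it yet. */
static int toy_online_page(struct toy_page *page)
{
	if (!page->backed)
		return -1;
	page->reserved = false;
	return 0;
}

/* Count the pages that have been onlined (reserved bit cleared). */
static size_t toy_count_online(const struct toy_page *pages, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (!pages[i].reserved)
			count++;
	return count;
}
```

A bulk operation that cleared 'reserved' on every page in the range would hand out pages whose 'backed' flag is still false, which is the failure mode the per-page approach avoids.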

Start/Size are automatically aligned to section boundaries in __add_pages().

See below.
==
int __add_pages(struct zone *zone, unsigned long phys_start_pfn,
		unsigned long nr_pages)
{
	unsigned long i;
	int err = 0;
	int start_sec, end_sec;
	/* during initialize mem_map, align hot-added range to section */
	start_sec = pfn_to_section_nr(phys_start_pfn);
	end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);
==
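
The effect of the two pfn_to_section_nr() calls above can be worked through in userspace. The sketch below reimplements the section arithmetic under stated assumptions: PAGE_SHIFT of 12 and SECTION_SIZE_BITS of 27 (4 KiB pages, 128 MiB sections, as on x86_64) are assumed values, and struct pfn_range and section_span() are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>

/* Assumed values: 4 KiB pages and 128 MiB sections (as on x86_64). */
#define PAGE_SHIFT		12
#define SECTION_SIZE_BITS	27
#define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION	(1UL << PFN_SECTION_SHIFT)

/* Userspace reimplementations of the section helpers. */
static unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

static unsigned long section_nr_to_pfn(unsigned long sec)
{
	return sec << PFN_SECTION_SHIFT;
}

/* Hypothetical result type for this example. */
struct pfn_range {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

/*
 * Hypothetical helper: widen [phys_start_pfn, phys_start_pfn + nr_pages)
 * to the whole sections it touches, using the same start_sec/end_sec
 * computation as the excerpt.
 */
static struct pfn_range section_span(unsigned long phys_start_pfn,
				     unsigned long nr_pages)
{
	struct pfn_range r;
	unsigned long start_sec, end_sec;

	assert(nr_pages > 0);

	start_sec = pfn_to_section_nr(phys_start_pfn);
	end_sec = pfn_to_section_nr(phys_start_pfn + nr_pages - 1);

	r.start_pfn = section_nr_to_pfn(start_sec);
	r.nr_pages = (end_sec - start_sec + 1) * PAGES_PER_SECTION;
	return r;
}
```

With these assumed constants a section holds 0x8000 pages, so a 0x100-page request starting at pfn 0x8100 is widened to the whole of section 1, and a two-page request that straddles pfn 0x8000 is widened to sections 0 and 1.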

And online_pages(), which onlines pages in [pfn, pfn + size), will walk the
registered resources within [pfn, pfn + size).
==
int online_pages(unsigned long pfn, unsigned long nr_pages)
<snip>
	walk_memory_resource(pfn, nr_pages, &onlined_pages,
			     online_pages_range);
==
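
The walk in the excerpt above can be modelled in userspace as an interval intersection: only the parts of [pfn, pfn + nr_pages) that are covered by a registered resource reach the callback, and holes are skipped silently. This is a sketch under assumptions, not the kernel implementation: struct pfn_res, walk_registered() and count_onlined() are hypothetical names, ranges are half-open, and the callback shape merely mirrors the call shown in the excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a registered memory resource, [start_pfn, end_pfn). */
struct pfn_res {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

typedef int (*walk_fn)(unsigned long start_pfn, unsigned long nr_pages,
		       void *arg);

/*
 * Call fn() once for each piece of [pfn, pfn + nr_pages) that overlaps a
 * registered resource. Stops early and returns the callback's value if
 * it is non-zero.
 */
static int walk_registered(const struct pfn_res *res, size_t n,
			   unsigned long pfn, unsigned long nr_pages,
			   void *arg, walk_fn fn)
{
	unsigned long end = pfn + nr_pages;
	size_t i;
	int ret = 0;

	assert(fn != NULL);

	for (i = 0; i < n; i++) {
		/* Clamp the resource to the requested window. */
		unsigned long s = res[i].start_pfn > pfn ? res[i].start_pfn : pfn;
		unsigned long e = res[i].end_pfn < end ? res[i].end_pfn : end;

		if (s >= e)
			continue;	/* no overlap: skipped silently */

		ret = fn(s, e - s, arg);
		if (ret)
			break;
	}
	return ret;
}

/* Example callback: accumulate the number of pages visited. */
static int count_onlined(unsigned long start_pfn, unsigned long nr_pages,
			 void *arg)
{
	(void)start_pfn;
	*(unsigned long *)arg += nr_pages;
	return 0;
}
```

Walking a window that contains a hole between two resources reports only the pages inside the resources, which matches the "skip them silently" behaviour described earlier in the thread.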

One of my concerns is how to handle the sysfs status in this case.

Another concern is that, as far as I know, no one has tried to online a section
twice in order to online the reserved pages in it, so you may hit bugs.
For example, mem_notify() in online_pages() will be called several times against
the same section.

I'd really rather prevent online_pages from happening at all, since it can only cause havoc.

Thanks,
J