Re: [PATCH] memory hotplug disable boot option

From: Nathan Fontenot
Date: Tue Jun 29 2010 - 11:39:04 EST


On 06/28/2010 09:56 PM, KOSAKI Motohiro wrote:
>> On Mon, 2010-06-28 at 08:44 -0700, Greg KH wrote:
>>>> The directories being created are the standard directories, one for each of the memory
>>>> sections present at boot. I think the most used files in each of these directories
>>>> are the state and removable files used to do memory hotplug.
>>>
>>> And perhaps we shouldn't really be creating so many directories? Why
>>> not work with the memory hotplug developers to change their interface to
>>> not abuse sysfs in such a manner?
>>
>> Heh, it wasn't abuse until we got this much memory. But, I think this
>> one is pretty much 100% my fault.
>>
>> Nathan, I think the right fix here is probably to untie sysfs from the
>> sections a bit. We should be able to have sysfs dirs that represent
>> more than one contiguous SECTION_SIZE area of memory.
>
> Why do we need ABI breakage? You yourself said that we guess ppc doesn't
> actually need 16MB sections. I think the IBM folks have to confirm it.
> If our guess is correct, only a firmware fix is necessary.

Yes, ppc still needs to support add/remove of 16MB sections. This correlates
to the smallest lmb size on ppc that we need to support.

>
> That said, I don't 100% refuse your idea. It's interesting. But
> in general I hate _unnecessary_ ABI changes.

Me too, but I'm not sure the current sysfs layout of memory scales well
for machines with huge amounts of memory.

How about providing an alternate sysfs layout for systems that have a large
number of memory sections? Even on the machines I worked with that have
1 and 2 TB of memory, if we increase the memory section size to equal the
lmb size we would still be creating 6k+ directories for a 1 TB machine.
This would alleviate much of the performance issue but still leaves us with
a directory of thousands (or tens of thousands for really big systems)
of memoryXXX subdirectories, which is not really human readable.

Or some method of having a single memoryXXX dir represent multiple sections,
as Dave suggested, would work. Perhaps there is a way to subdivide the
memory section dirs into separate dirs based on their node.
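
Both groupings can be sketched in a few lines. This is only an illustration
of the idea, not a proposed kernel interface: the function names, the
fixed-size aligned grouping, and the section-to-node mapping passed in are
all assumptions made for the example.

```python
from collections import defaultdict


def group_sections(section_ids, sections_per_dir):
    """Map each (hypothetical) directory index to the section ids it covers.

    Sections are grouped into fixed-size, aligned blocks. If the memory map
    has holes, a group may hold fewer than sections_per_dir sections.
    """
    groups = defaultdict(list)
    for sid in sorted(section_ids):
        groups[sid // sections_per_dir].append(sid)
    return dict(groups)


def group_by_node(section_to_node):
    """Map each node id to the sorted section ids that belong to it."""
    nodes = defaultdict(list)
    for sid, node in section_to_node.items():
        nodes[node].append(sid)
    return {node: sorted(sids) for node, sids in nodes.items()}
```

For example, group_sections(range(8), 4) yields two groups of four sections
each, so eight per-section dirs would collapse into two.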

At the point of dealing with this many memory sections, would it make sense
to not create directories for each of the memory sections? Perhaps just
files to report information about the memory sections.
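
As a rough picture of what a flat report could carry, the sketch below walks
the existing per-section directories and collapses the state and removable
files into one line per section. The output format and the function name are
made up for illustration; /sys/devices/system/memory is assumed as the
default root, and the root is a parameter so the code can be pointed at any
directory tree with the same shape.

```python
import os
import re


def flat_report(root="/sys/devices/system/memory"):
    """Return one 'index state removable' line per memoryN directory under root."""
    entries = []
    for name in os.listdir(root):
        match = re.fullmatch(r"memory(\d+)", name)
        if match:
            entries.append((int(match.group(1)), name))

    lines = []
    for index, name in sorted(entries):  # numeric order, not lexical
        fields = []
        for attr in ("state", "removable"):
            try:
                with open(os.path.join(root, name, attr)) as f:
                    fields.append(f.read().strip())
            except OSError:
                fields.append("?")  # attribute missing or unreadable
        lines.append(f"{index} {' '.join(fields)}")
    return lines
```

Calling print("\n".join(flat_report())) on a hotplug-capable system would
give a single listing in place of thousands of directory reads.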

-Nathan
