Re: [PATCH 2/6] ksm: dont allow overlap memory addresses registrations.

From: Andrea Arcangeli
Date: Thu May 07 2009 - 06:49:05 EST


On Thu, May 07, 2009 at 08:55:47AM +0900, Minchan Kim wrote:
> Hmm. Don't you consider 32-bit system ?

Sorry, I was too terse. Don't worry, I meant hugemem 32-bit systems,
like 32G. If there's not much highmem, no problem can ever
happen. Just as pagetables had to be moved to highmem on 32G 32-bit
systems to make them workable, KSM on those systems may consume lots
of lowmem and trigger early OOM conditions when allocating inodes
or other slab objects. We don't plan to move the rmap_items (which
represent physical pages by the chain of virtual addresses that map
them) into highmem.

> Many embedded systems are so I/O bound that we can use much CPU time in there.

Embedded systems with >4G of ram should run 64bit these days, so I
don't see a problem.

> I hope this feature will help saving memory in embedded system.

It will (assuming that there are apps that are duplicating anonymous
memory of course ;).

> One more thing about interface.
>
> Ksm map regions are dynamic characteristic ?
> I mean sometime A application calls ioctl(0x800000, 0x10000) and sometime it calls ioctl(0xb7000000, 0x20000);
> Of course, It depends on application's behavior.

Looks like the ioctl API is going away in favour of madvise, so it'll
function like madvise: if you munmap the region, the KSM registration
will go away.
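
For illustration only, a minimal userspace sketch of that lifecycle. It
assumes the advice flag is called MADV_MERGEABLE, as in mainline kernels
built with CONFIG_KSM; this thread does not fix the name. The helper
ksm_madvise_demo() is hypothetical and not part of the patch series.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map an anonymous region, fill it with identical bytes, and ask the
 * kernel to consider it for merging.
 * Returns 1 if the range was registered, 0 if this kernel or libc has
 * no KSM madvise support, -1 on any other failure.
 */
int ksm_madvise_demo(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Identical content in every page makes the range a merge candidate. */
    memset(p, 0x5a, len);

    int ret;
#ifdef MADV_MERGEABLE
    if (madvise(p, len, MADV_MERGEABLE) == 0)
        ret = 1;
    else if (errno == EINVAL)
        ret = 0;        /* advice not recognised, e.g. no CONFIG_KSM */
    else
        ret = -1;
#else
    ret = 0;            /* headers too old to know the flag */
#endif

    /* Unmapping the region drops the mapping and its registration. */
    if (munmap(p, len) != 0)
        return -1;
    return ret;
}
```

No explicit unregister call is needed before the munmap; that is the
point of tying the registration to the mapping.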

> ex) echo 'pid 0x8050000 0x100000' > sysfs or procfs or cgroup.

This was answered by Chris, and surely this is feasible, as it is
feasible for kksmd to scan the whole system regardless of any
madvise. Some sysfs mangling should allow it.

However, regardless of the highmem issue (this applies to 64-bit systems
too), you have to keep in mind that for kksmd to keep track of all pages
under scan it has to build an rbtree and allocate rmap_items and
tree_items for each page tracked. Those objects take some memory, so
if there's not much RAM sharing you may waste more memory in the kksmd
allocations than KSM actually frees. This is why it's better to
selectively register only the ranges that we know in advance have a
high probability of freeing memory.
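
The tradeoff can be put as rough arithmetic. The sketch below is only a
back-of-the-envelope model: both function names are hypothetical, and
the per-page bookkeeping cost (meta_per_page) is a parameter the caller
has to supply, since the real size of rmap_items and tree_items is not
stated here.

```c
/*
 * Net bytes saved by KSM for a registered range.
 * pages_tracked: pages kksmd scans and keeps metadata for
 * pages_merged:  pages actually freed through sharing
 * meta_per_page: assumed bytes of rmap_item/tree_item bookkeeping per
 *                tracked page (hypothetical figure, supplied by caller)
 * page_size:     bytes per page
 * A negative result means tracking costs more than merging frees.
 */
long long ksm_net_saving(long long pages_tracked, long long pages_merged,
                         long long meta_per_page, long long page_size)
{
    return pages_merged * page_size - pages_tracked * meta_per_page;
}

/* Smallest number of merged pages needed before KSM pays for itself. */
long long ksm_break_even_pages(long long pages_tracked,
                               long long meta_per_page, long long page_size)
{
    long long cost = pages_tracked * meta_per_page;
    return (cost + page_size - 1) / page_size;  /* round up */
}
```

With 100000 tracked 4096-byte pages and an assumed 64 bytes of
bookkeeping per page, at least 1563 pages have to be merged before
there is any net gain.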

Thanks!
Andrea