Re: memory resource accounting (was Re: [RFC, PATCH 0/5] Going forwardwith Resource Management - A cpu controller)

From: Kirill Korotaev
Date: Wed Aug 09 2006 - 09:42:17 EST


But if you have a unified struct page accounting, you don't need that.
You don't need struct radix_tree_node accounting, you don't need buffer_head
accounting, pagetable page accounting, vm_area_struct accounting, task_struct
accounting, etc etc in order to do your memory accounting if what you just
want to know is "who allocated what".
Sorry, are you suggesting to use page accounting for slab objects?
You mean, that if we can account page fractions then we can charge
part of slab page to object owners?
If this is correct, then I think it is ineffecient.
In our current implementation page beancounters can charge
only equal fraction of page to each owner, so it is not suitable for slabs.
Moreoever, it is easier to do correct accounting from the slab allocator
itself and with much less overhead.

And remember that if you have one container going crazy with inode/dentry
cache, it will get hit by its resource limit and end up having to reclaim
them or go OOM.
In order to decide which of the containers is crazy you need
to account correctly the amount of _pinned_ dcache memory.
And even to select correct container for OOM you need to have a corect accounting
of _pinned_ dcache.

Now you *may* want to split the actual accounting into kernel and user parts
if you're worried about obscure corner cases in kernel memory accounting. But
this would basically come for free when you have the GFP_EASYRECLAIM thingy
(at any rate, it is quite unintrusive).


Basically, what I have been hearing is that people want to be able to
surgically isolate the memory allocation of one container from that of
another. IMO this is simply infeasible (and exploit prone) to do it on a
per-kernel-object basis.
We have the following scheme:
cache which should be charged are marked as SLAB_UBC. the same for particular allocations,
we have a GFP_UBC flag specifing that the allocation should be charged to
the owner. Does it look good for you?
I will post kernel memory accounting patches soon here.

Thanks,
Kirill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/