Re: [PATCH -mm] cgroup,cpuset: use alternative malloc to allocate large memory buf for tasks

From: Paul Menage
Date: Thu Sep 11 2008 - 15:55:35 EST


On Thu, Sep 11, 2008 at 9:45 AM, Paul Menage <menage@xxxxxxxxxx> wrote:
> On Thu, Sep 11, 2008 at 3:30 AM, Lai Jiangshan <laijs@xxxxxxxxxxxxxx> wrote:
>> This new alternative allocation implementation can allocate memory
>> up to 64M in 32bits system or 512M in 64bits system.
>
> Isn't a lot of this patch just reimplementing vmalloc()?

To expand on this, I think there are two ways of fixing the large
allocation problem:

1) just use vmalloc() rather than kmalloc() when the pid array is over
a certain threshold (probably 1 page?)

2) allocate pages/chunks in a similar way to your patch, but don't bother
mapping them. Instead we'd use the fact that each record (pid) is the
same size, and hence we can very easily use the high bits of an index
to select the chunk and the low bits to select the pid within the
chunk - no need to suffer the overhead of setting up and tearing down
ptes in order for the MMU to do the same operation for us in hardware.
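
A minimal userspace sketch of the indexing scheme in option 2, to make
the high-bits/low-bits split concrete. This is not kernel code: all
names (pid_chunks, pid_slot, CHUNK_SHIFT) are hypothetical, the chunk
size of 1024 pids is an arbitrary assumption, and malloc()/calloc()
stand in for whatever page or chunk allocator the kernel side would use.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Assumed chunk geometry: 2^10 = 1024 pids per chunk (illustrative). */
#define CHUNK_SHIFT 10
#define CHUNK_SIZE  ((size_t)1 << CHUNK_SHIFT)
#define CHUNK_MASK  (CHUNK_SIZE - 1)

/* Hypothetical container: a table of separately allocated chunks. */
struct pid_chunks {
    pid_t **chunks;
    size_t nr_chunks;
};

/* Free every chunk, then the chunk table itself. */
void pid_chunks_free(struct pid_chunks *pc)
{
    for (size_t i = 0; i < pc->nr_chunks; i++)
        free(pc->chunks[i]);
    free(pc->chunks);
    pc->chunks = NULL;
    pc->nr_chunks = 0;
}

/* Allocate enough fixed-size chunks to hold npids records.
 * Returns 0 on success, -1 on allocation failure. */
int pid_chunks_alloc(struct pid_chunks *pc, size_t npids)
{
    size_t n = (npids + CHUNK_SIZE - 1) >> CHUNK_SHIFT;

    pc->nr_chunks = 0;
    pc->chunks = calloc(n ? n : 1, sizeof(*pc->chunks));
    if (!pc->chunks)
        return -1;

    for (size_t i = 0; i < n; i++) {
        pc->chunks[i] = malloc(CHUNK_SIZE * sizeof(pid_t));
        if (!pc->chunks[i]) {
            pid_chunks_free(pc);
            return -1;
        }
        pc->nr_chunks++;
    }
    return 0;
}

/* High bits of idx select the chunk, low bits select the pid in it. */
pid_t *pid_slot(struct pid_chunks *pc, size_t idx)
{
    assert((idx >> CHUNK_SHIFT) < pc->nr_chunks);
    return &pc->chunks[idx >> CHUNK_SHIFT][idx & CHUNK_MASK];
}
```

With this layout no single allocation is larger than one chunk, and a
lookup costs one shift, one mask and two loads instead of relying on a
virtually contiguous mapping.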

Obviously option 1 is a lot simpler, but option 2 avoids a
vmap()/vunmap() on every open/close of a tasks file. I'm not familiar
enough with the performance of vmap/vunmap on typical
hardware/workloads to know how high this overhead is - maybe a VM
guru can comment?

Paul