Re: [Lse-tech] Re: [RFC] Dynamic percpu data allocator

From: Dipankar Sarma (dipankar@in.ibm.com)
Date: Thu May 30 2002 - 12:55:13 EST


On Thu, May 30, 2002 at 08:56:36AM -0500, Mala Anand wrote:
>
> dipankar@beaverton.ibm.com
> To: BALBIR SINGH <balbir.singh@wipro.com>
>
> >The per-cpu data allocator allocates one copy for *each* CPU.
> >It uses the slab allocator underneath. Eventually, when/if we have
> >per-cpu/numa-node slab allocation, the per-cpu data allocator
> >can allocate every CPU's copy from memory closest to it.
>
> Does this mean that memory allocation will happen on "each" CPU?
> Does the slab allocator allocate the memory on each CPU? Your per-cpu
> data allocator sounds like the hot list skbs that are in the tcpip stack
> in the sense it is one level above the slab allocator and the list is
> kept per cpu. If slab allocator is fixed for per cpu, do you still
> need this per-cpu data allocator?

Actually, I don't know for sure what plans are afoot to make the slab
allocator per-cpu. One plan I heard about was allocating from per-cpu pools
rather than keeping per-cpu copies. My requirements are similar to those of
the skb hot list. I want to do this -

        int *ctrp1, *ctrp2;
        
        ctrp1 = kmalloc_percpu(sizeof(*ctrp1), GFP_ATOMIC);
        if (ctrp1 == NULL) {
                /* recover */
        }
        ctrp2 = kmalloc_percpu(sizeof(*ctrp2), GFP_ATOMIC);
        if (ctrp2 == NULL) {
                /* recover */
        }

        /* parenthesize: postfix ++ binds tighter than unary * */
        (*per_cpu_ptr(ctrp1, smp_processor_id()))++;
        (*this_cpu_ptr(ctrp2))++;

Now, I can allocate by making ctrp1/ctrp2 point to an array of
NR_CPUS pointers and calling kmalloc() for each CPU's copy of the
int. This is simple and will work.

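        void **ptrs = kmalloc(sizeof(*ptrs) * NR_CPUS, flags);
        int i;

        if (!ptrs)
                return NULL;
        for (i = 0; i < NR_CPUS; i++) {
                ptrs[i] = kmalloc(size, flags);
                if (!ptrs[i])
                        goto unwind_oom;
        }
        return ptrs;
unwind_oom:
        while (--i >= 0)        /* free the copies allocated so far */
                kfree(ptrs[i]);
        kfree(ptrs);
        return NULL;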
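
For illustration, here is a minimal user-space sketch of how the access
and free side of this pointer-array scheme could look. It is not code
from the patch: malloc()/calloc() stand in for kmalloc(), NR_CPUS is
assumed to be 4, and the names simple_alloc_percpu(),
simple_per_cpu_ptr() and simple_free_percpu() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 4       /* assumption: fixed CPU count for this sketch */

/*
 * Hypothetical helper: allocate one zeroed copy of `size` bytes per CPU.
 * The returned handle is the array of per-CPU pointers, or NULL on failure.
 */
static void *simple_alloc_percpu(size_t size)
{
        void **ptrs = malloc(sizeof(*ptrs) * NR_CPUS);
        int i;

        if (!ptrs)
                return NULL;
        for (i = 0; i < NR_CPUS; i++) {
                ptrs[i] = calloc(1, size);
                if (!ptrs[i])
                        goto unwind_oom;
        }
        return ptrs;

unwind_oom:
        /* free only the copies that were successfully allocated */
        while (--i >= 0)
                free(ptrs[i]);
        free(ptrs);
        return NULL;
}

/* Hypothetical accessor: one extra pointer load to reach a CPU's copy. */
static void *simple_per_cpu_ptr(void *handle, int cpu)
{
        assert(cpu >= 0 && cpu < NR_CPUS);
        return ((void **)handle)[cpu];
}

/* Hypothetical counterpart that releases every copy and the array. */
static void simple_free_percpu(void *handle)
{
        void **ptrs = handle;
        int i;

        for (i = 0; i < NR_CPUS; i++)
                free(ptrs[i]);
        free(ptrs);
}
```

A counter would then be bumped with
(*(int *)simple_per_cpu_ptr(h, cpu))++, at the cost of one indirection
through the pointer array on every access.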

However, I would like to use kmalloc_percpu() for allocating very
small objects - typically integer counters or small structures
to be used as per-cpu counters for things like dst entries and dentries.
kmalloc() will waste the rest of the cache line for such small objects.
The alternative is to use a layer of code to interleave small objects
and save on space.
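
To put rough numbers on the waste, here is a small sketch of the
arithmetic. It assumes that each per-CPU copy ends up occupying a cache
line of its own, and it assumes a 32-byte SMP_CACHE_BYTES purely for
illustration (the real value depends on the architecture). The helper
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SMP_CACHE_BYTES 32      /* assumption: 32-byte cache line */

/* Bytes left unused when a single object has a cache line to itself. */
static size_t wasted_per_copy(size_t obj_size)
{
        assert(obj_size > 0 && obj_size <= SMP_CACHE_BYTES);
        return SMP_CACHE_BYTES - obj_size;
}

/* Number of same-sized objects that could share one per-CPU line. */
static size_t objs_per_line(size_t obj_size)
{
        assert(obj_size > 0 && obj_size <= SMP_CACHE_BYTES);
        return SMP_CACHE_BYTES / obj_size;
}
```

Under these assumptions a 4-byte counter leaves 28 of the 32 bytes in
each CPU's line unused, while interleaving would let 8 such counters
share that line.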

     CPU #0        CPU #1

   ---------     ---------    Start of cache line
    *ctrp1        *ctrp1
    *ctrp2        *ctrp2

      .             .
      .             .
      .             .
      .             .
      .             .

   ---------     ---------    End of cache line

I have an allocator that interleaves objects like this, provided their
size is a factor of SMP_CACHE_BYTES.
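
The interleaving idea can be sketched in user space as follows. This is
an illustrative sketch under stated assumptions, not the allocator
mentioned above: NR_CPUS is assumed to be 4, SMP_CACHE_BYTES is assumed
to be 32, C11 aligned_alloc() stands in for kernel memory allocation,
there is a single block with no per-object free, and every name
(struct interleaved_block, block_init(), block_alloc(),
block_per_cpu_ptr(), block_destroy()) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPUS         4       /* assumption: fixed CPU count */
#define SMP_CACHE_BYTES 32      /* assumption: 32-byte cache line */

/* One cache line per CPU; slot k sits at the same offset in every line. */
struct interleaved_block {
        size_t obj_size;        /* slot size, a factor of SMP_CACHE_BYTES */
        size_t used;            /* slots handed out so far */
        char *lines;            /* NR_CPUS line-aligned cache lines */
};

/* Returns 0 on success, -1 if the size does not divide the line or on OOM. */
static int block_init(struct interleaved_block *b, size_t obj_size)
{
        if (obj_size == 0 || SMP_CACHE_BYTES % obj_size != 0)
                return -1;
        b->lines = aligned_alloc(SMP_CACHE_BYTES, NR_CPUS * SMP_CACHE_BYTES);
        if (!b->lines)
                return -1;
        memset(b->lines, 0, NR_CPUS * SMP_CACHE_BYTES);
        b->obj_size = obj_size;
        b->used = 0;
        return 0;
}

/* Hands out the next slot; the result is CPU 0's copy. NULL when full. */
static void *block_alloc(struct interleaved_block *b)
{
        if (b->used == SMP_CACHE_BYTES / b->obj_size)
                return NULL;
        return b->lines + b->obj_size * b->used++;
}

/* CPU n's copy lies exactly n cache lines after CPU 0's copy. */
static void *block_per_cpu_ptr(void *obj, int cpu)
{
        assert(cpu >= 0 && cpu < NR_CPUS);
        return (char *)obj + (size_t)cpu * SMP_CACHE_BYTES;
}

static void block_destroy(struct interleaved_block *b)
{
        free(b->lines);
        b->lines = NULL;
}
```

In this layout two counters allocated back to back share each CPU's
cache line, and reaching a given CPU's copy is plain pointer arithmetic
rather than a lookup through an array of pointers.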

I hope someone can tell me that I don't even have to do this. Otherwise
I will go ahead and do my thing.

Thanks

-- 
Dipankar Sarma  <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.


