Re: [PATCH -v4 3/4] lib, Make gen_pool memory allocator lockless

From: Mathieu Desnoyers
Date: Fri Apr 29 2011 - 00:24:03 EST


* Huang Ying (ying.huang@xxxxxxxxx) wrote:
> On 04/29/2011 09:11 AM, Mathieu Desnoyers wrote:
> > * Huang Ying (ying.huang@xxxxxxxxx) wrote:
> >> Hi, Mathieu,
> >>
> >> Thanks for your comments.
> >>
> >> On 04/28/2011 10:37 PM, Mathieu Desnoyers wrote:
> >>> * Huang Ying (ying.huang@xxxxxxxxx) wrote:
> >> [snip]
> >>>>
> >>>> +/**
> >>>> + * gen_pool_for_each_chunk - iterate over chunks of generic memory pool
> >>>> + * @chunk: the struct gen_pool_chunk * to use as a loop cursor
> >>>> + * @pool: the generic memory pool
> >>>> + *
> >>>> + * Not lockless; proper mutual exclusion is needed to use this macro
> >>>> + * with other gen_pool functions simultaneously.
> >>>> + */
> >>>> +#define gen_pool_for_each_chunk(chunk, pool) \
> >>>> + list_for_each_entry_rcu(chunk, &(pool)->chunks, next_chunk)
> >>>
> >>> Is it just me or is this macro never used? Maybe you should consider
> >>> removing it.
> >>
> >> This macro is not used in this patch. But it is used in 4/4 of the
> >> patchset to free the backing pages before destroying the pool.
> >
> > Depending on how frequently you want to use it, you might want to use
> > list_for_each_entry_rcu directly rather than a macro wrapper. E.g. for
> > 2-3 uses, adding a macro just obfuscates the code IMHO (e.g. you don't
> > know it iterates on an RCU list by looking at the caller code).
>
> Yes. gen_pool_for_each_chunk() is not a good wrapper. I just don't want
> to expose too many implementation details to users; after all, we are
> working on library code. Maybe something like the below is better?
>
> void gen_pool_for_each_chunk(struct gen_pool *pool,
>         void (*func)(struct gen_pool *pool, struct gen_pool_chunk *chunk))
> {
>         struct gen_pool_chunk *chunk;
>
>         rcu_read_lock();
>         list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk)
>                 func(pool, chunk);
>         rcu_read_unlock();
> }

If it is expected to be exposed to other parts of the kernel, then indeed
we should not expect callers to magically know they must hold the RCU
read-side lock.

I'm not sure whether this iterator is necessary though. Just a comment
could suffice.
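
If you do keep the callback form, I would expect the 4/4 caller to end
up looking something like the sketch below (untested; chunk_free_pages
and the exact page-freeing call are hypothetical, just to illustrate
the calling convention):

static void chunk_free_pages(struct gen_pool *pool,
                             struct gen_pool_chunk *chunk)
{
        /* Hypothetical callback: release the pages backing this chunk. */
        free_pages(chunk->start_addr,
                   get_order(chunk->end_addr - chunk->start_addr));
}

        /* Callers never touch the RCU-protected list directly: */
        gen_pool_for_each_chunk(pool, chunk_free_pages);

That way the RCU details stay entirely inside lib/genalloc.c.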

>
> >>
> >> [snip]
> >>>> @@ -108,43 +226,50 @@ EXPORT_SYMBOL(gen_pool_destroy);
> >>>> * @size: number of bytes to allocate from the pool
> >>>> *
> >>>> * Allocate the requested number of bytes from the specified pool.
> >>>> - * Uses a first-fit algorithm.
> >>>> + * Uses a first-fit algorithm. Can not be used in NMI handler on
> >>>> + * architectures without NMI-safe cmpxchg implementation.
> >>>> */
> >>>> unsigned long gen_pool_alloc(struct gen_pool *pool, size_t size)
> >>>> {
> >>>> - struct list_head *_chunk;
> >>>> struct gen_pool_chunk *chunk;
> >>>> - unsigned long addr, flags;
> >>>> + unsigned long addr;
> >>>> int order = pool->min_alloc_order;
> >>>> - int nbits, start_bit, end_bit;
> >>>> + int nbits, start_bit = 0, end_bit, remain;
> >>>> +
> >>>> +#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
> >>>> + BUG_ON(in_nmi());
> >>>> +#endif
> >>>>
> >>>> if (size == 0)
> >>>> return 0;
> >>>>
> >>>> nbits = (size + (1UL << order) - 1) >> order;
> >>>> -
> >>>> - read_lock(&pool->lock);
> >>>> - list_for_each(_chunk, &pool->chunks) {
> >>>> - chunk = list_entry(_chunk, struct gen_pool_chunk, next_chunk);
> >>>> + rcu_read_lock();
> >>>> + list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
> >>>> + if (size > atomic_read(&chunk->avail))
> >>>> + continue;
> >>>>
> >>>> end_bit = (chunk->end_addr - chunk->start_addr) >> order;
> >>>> -
> >>>> - spin_lock_irqsave(&chunk->lock, flags);
> >>>> - start_bit = bitmap_find_next_zero_area(chunk->bits, end_bit, 0,
> >>>> - nbits, 0);
> >>>> - if (start_bit >= end_bit) {
> >>>> - spin_unlock_irqrestore(&chunk->lock, flags);
> >>>> +retry:
> >>>> + start_bit = bitmap_find_next_zero_area(chunk->bits, end_bit,
> >>>> + start_bit, nbits, 0);
> >>>> + if (start_bit >= end_bit)
> >>>> continue;
> >>>> + remain = bitmap_set_ll(chunk->bits, start_bit, nbits);
> >>>> + if (remain) {
> >>>> + remain = bitmap_clear_ll(chunk->bits, start_bit,
> >>>> + nbits - remain);
> >>>> + BUG_ON(remain);
> >>>> + goto retry;
> >>>> }
> >>>>
> >>>> addr = chunk->start_addr + ((unsigned long)start_bit << order);
> >>>> -
> >>>> - bitmap_set(chunk->bits, start_bit, nbits);
> >>>> - spin_unlock_irqrestore(&chunk->lock, flags);
> >>>> - read_unlock(&pool->lock);
> >>>> + size = nbits << order;
> >>>> + atomic_sub(size, &chunk->avail);
> >>>> + rcu_read_unlock();
> >>>
> >>> I don't really like seeing an rcu_read_unlock() within an RCU list
> >>> iteration (even if it comes right before a "return"). Doing:
> >>>
> >>> unsigned long addr = 0;
> >>>
> >>> rcu_read_lock();
> >>> list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
> >>> if (...)
> >>> continue;
> >>> ...
> >>> addr = ...;
> >>> break;
> >>> }
> >>> rcu_read_unlock();
> >>> return addr;
> >>>
> >>> Would be more symmetric, and would remove one return path, which makes
> >>> the code easier to modify in the future.
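
For concreteness, filling in the blanks with the code from your patch,
the single-return-path version I have in mind would read roughly as
below (untested, a sketch only):

unsigned long gen_pool_alloc(struct gen_pool *pool, size_t size)
{
        struct gen_pool_chunk *chunk;
        unsigned long addr = 0;
        int order = pool->min_alloc_order;
        int nbits, start_bit = 0, end_bit, remain;

#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
        BUG_ON(in_nmi());
#endif

        if (size == 0)
                return 0;

        nbits = (size + (1UL << order) - 1) >> order;
        rcu_read_lock();
        list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
                if (size > atomic_read(&chunk->avail))
                        continue;

                end_bit = (chunk->end_addr - chunk->start_addr) >> order;
retry:
                start_bit = bitmap_find_next_zero_area(chunk->bits, end_bit,
                                                       start_bit, nbits, 0);
                if (start_bit >= end_bit)
                        continue;
                remain = bitmap_set_ll(chunk->bits, start_bit, nbits);
                if (remain) {
                        remain = bitmap_clear_ll(chunk->bits, start_bit,
                                                 nbits - remain);
                        BUG_ON(remain);
                        goto retry;
                }

                addr = chunk->start_addr +
                        ((unsigned long)start_bit << order);
                atomic_sub(nbits << order, &chunk->avail);
                break;  /* fall through to the single unlock/return */
        }
        rcu_read_unlock();
        return addr;
}

addr == 0 doubles as the failure indication, exactly as in your version.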
> >>
> >> Unlocking within a loop is common in the Linux kernel. Sometimes it
> >> makes the code cleaner (but not always). Yes, for this case we can
> >> avoid unlocking in the loop easily, but for the next case it is not
> >> so clean.
> >
> > See comment below,
> >
> >>
> >>>> return addr;
> >>>> }
> >>>> - read_unlock(&pool->lock);
> >>>> + rcu_read_unlock();
> >>>> return 0;
> >>>> }
> >>>> EXPORT_SYMBOL(gen_pool_alloc);
> >>>> @@ -155,33 +280,73 @@ EXPORT_SYMBOL(gen_pool_alloc);
> >>>> * @addr: starting address of memory to free back to pool
> >>>> * @size: size in bytes of memory to free
> >>>> *
> >>>> - * Free previously allocated special memory back to the specified pool.
> >>>> + * Free previously allocated special memory back to the specified
> >>>> + * pool. Can not be used in NMI handler on architectures without
> >>>> + * NMI-safe cmpxchg implementation.
> >>>> */
> >>>> void gen_pool_free(struct gen_pool *pool, unsigned long addr, size_t size)
> >>>> {
> >>>> - struct list_head *_chunk;
> >>>> struct gen_pool_chunk *chunk;
> >>>> - unsigned long flags;
> >>>> int order = pool->min_alloc_order;
> >>>> - int bit, nbits;
> >>>> + int start_bit, nbits, remain;
> >>>>
> >>>> - nbits = (size + (1UL << order) - 1) >> order;
> >>>> -
> >>>> - read_lock(&pool->lock);
> >>>> - list_for_each(_chunk, &pool->chunks) {
> >>>> - chunk = list_entry(_chunk, struct gen_pool_chunk, next_chunk);
> >>>> +#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
> >>>> + BUG_ON(in_nmi());
> >>>> +#endif
> >>>>
> >>>> + nbits = (size + (1UL << order) - 1) >> order;
> >
> > you could add:
> >
> > remain = nbits;
> >
> >>>> + rcu_read_lock();
> >>>> + list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
> >>>> if (addr >= chunk->start_addr && addr < chunk->end_addr) {
> >>>> BUG_ON(addr + size > chunk->end_addr);
> >>>> - spin_lock_irqsave(&chunk->lock, flags);
> >>>> - bit = (addr - chunk->start_addr) >> order;
> >>>> - while (nbits--)
> >>>> - __clear_bit(bit++, chunk->bits);
> >>>> - spin_unlock_irqrestore(&chunk->lock, flags);
> >>>> - break;
> >>>> + start_bit = (addr - chunk->start_addr) >> order;
> >
> > You could turn this:
> >
> >>>> + remain = bitmap_clear_ll(chunk->bits, start_bit, nbits);
> >>>> + BUG_ON(remain);
> >>>> + size = nbits << order;
> >>>> + atomic_add(size, &chunk->avail);
> >
> > into:
> >
> > remain = bitmap_clear_ll(chunk->bits, start_bit, nbits);
> > size = nbits << order;
> > atomic_add(size, &chunk->avail);
> > break;
> >
> >
> >>>> + rcu_read_unlock();
> >>>
> >>> Same comment as above apply here.
> >>
> >> It is harder to remove the unlock from the loop here. An extra
> >> variable would be needed to indicate that something was freed from
> >> the pool. Do you think it is cleaner to just keep the unlock inside
> >> the loop here?
> >>
> >> Best Regards,
> >> Huang Ying
> >>
> >>> + return;
> >>> }
> >>> }
> >
> > And turn this:
> >
> >>> - BUG_ON(nbits > 0);
> >>> - read_unlock(&pool->lock);
> >>> + rcu_read_unlock();
> >>> + BUG();
> >
> > into:
> >
> > BUG_ON(remain);
> > rcu_read_unlock();
> >
> > Does that look OK to you? On the plus side, you end up with a single
> > BUG_ON() in the function.
>
> I am afraid this makes the code a little harder to understand. Why do
> you hate unlocking in a loop so much? It is common in the kernel and I
> think most kernel developers are familiar with it.

I'm fine either way for this function, no strong opinion on this one.
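
For reference, the assembled variant I was suggesting would look
roughly like this (again untested, a sketch only):

void gen_pool_free(struct gen_pool *pool, unsigned long addr, size_t size)
{
        struct gen_pool_chunk *chunk;
        int order = pool->min_alloc_order;
        int start_bit, nbits, remain;

#ifndef CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG
        BUG_ON(in_nmi());
#endif

        nbits = (size + (1UL << order) - 1) >> order;
        remain = nbits; /* stays nonzero if no chunk matches addr */
        rcu_read_lock();
        list_for_each_entry_rcu(chunk, &pool->chunks, next_chunk) {
                if (addr >= chunk->start_addr && addr < chunk->end_addr) {
                        BUG_ON(addr + size > chunk->end_addr);
                        start_bit = (addr - chunk->start_addr) >> order;
                        remain = bitmap_clear_ll(chunk->bits, start_bit,
                                                 nbits);
                        atomic_add(nbits << order, &chunk->avail);
                        break;
                }
        }
        /* Catches both a failed clear and an address in no chunk. */
        BUG_ON(remain);
        rcu_read_unlock();
}

Initializing remain to nbits is what lets the single BUG_ON() catch
the no-matching-chunk case as well.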

Thanks,

Mathieu

>
> Best Regards,
> Huang Ying

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com