Re: [PATCH 0/9] Scalability requirements for sysv ipc - v3

From: Paul E. McKenney
Date: Fri May 30 2008 - 04:22:56 EST


On Wed, May 07, 2008 at 01:35:53PM +0200, Nadia.Derbey@xxxxxxxx wrote:
>
> After scalability problems have been detected when using the sysV ipcs, I
> have proposed to use an RCU based implementation of the IDR api instead (see
> threads http://lkml.org/lkml/2008/4/11/212 and
> http://lkml.org/lkml/2008/4/29/295).
>
> This resulted in many people asking to convert the idr API and make it
> rcu safe (because most of the code was duplicated and thus unmaintanable
> and unreviewable).
>
> So here is a first attempt.
>
> The important change wrt to the idr API itself is during idr removes:
> idr layers are freed after a grace period, instead of being moved to the
> free list.
>
> The important change wrt to ipcs, is that idr_find() can now be called
> locklessly inside a rcu read critical section.
>
> Here are the results I've got for the pmsg test sent by Manfred:
>
> 2.6.25-rc3-mm1 2.6.25-rc3-mm1+ 2.6.25-mm1 Patched 2.6.25-mm1
> 1 1168441 1064021 876000 947488
> 2 1094264 921059 1549592 1730685
> 3 2082520 1738165 1694370 2324880
> 4 2079929 1695521 404553 2400408
> 5 2898758 406566 391283 3246580
> 6 2921417 261275 263249 3752148
> 7 3308761 126056 191742 4243142
> 8 3329456 100129 141722 4275780
>
> 1st column: stock 2.6.25-rc3-mm1
> 2nd column: 2.6.25-rc3-mm1 + ipc patches (store ipcs into idrs)
> 3nd column: stock 2.6.25-mm1
> 4th column: 2.6.25-mm1 + this pacth series.
>
> I'll send a chart as an answer to this mail: don't know how to do that
> with quilt :-(
>
>
> Reviewers are more than ever welcome!
>
> Patches should be applied on linux-2.6.25-mm1, in the following order:
>
> [ PATCH 01/09 ] : idr_add_rcu_head.patch
> [ PATCH 02/09 ] : idr_rename_routines.patch
> [ PATCH 03/09 ] : idr_fix_printk.patch
> [ PATCH 04/09 ] : idr_rc_to_errno.patch
> [ PATCH 05/09 ] : idr_get_new_rcu_safe.patch
> [ PATCH 06/09 ] : idr_find_rcu_safe.patch
> [ PATCH 07/09 ] : idr_remove_rcu_safe.patch
> [ PATCH 08/09 ] : ipc_fix_ipc_lock.patch
> [ PATCH 09/09 ] : remove_ipc_lock_down.patch
>
> Patches 2, 3 and 4 do not introduce actual changes.
>
> I won't be available before next Tuesday, so, please, don't be mad at me if
> I'm not answering fast enough.

I guess in my case, next Tuesday was not an issue. :-/

Anyway, the idr.c changes look good to me. Not sure why you are using
INIT_RCU_HEAD() given that call_rcu() completely initializes the fields.
Using INIT_RCU_HEAD() doesn't cause any problems, but does add needless
code.

Commentary below, looks good from an RCU viewpoint.

Thanx, Paul

> /*
> * 2002-10-18 written by Jim Houston jim.houston@xxxxxxxx
> * Copyright (C) 2002 by Concurrent Computer Corporation
> * Distributed under the GNU GPL license version 2.
> *
> * Modified by George Anzinger to reuse immediately and to use
> * find bit instructions. Also removed _irq on spinlocks.
> *
> * Modified by Nadia Derbey to make it RCU safe.
> *
> * Small id to pointer translation service.
> *
> * It uses a radix tree like structure as a sparse array indexed
> * by the id to obtain the pointer. The bitmap makes allocating
> * a new id quick.
> *
> * You call it to allocate an id (an int) an associate with that id a
> * pointer or what ever, we treat it as a (void *). You can pass this
> * id to a user for him to pass back at a later time. You then pass
> * that id to this code and it returns your pointer.
>
> * You can release ids at any time. When all ids are released, most of
> * the memory is returned (we keep IDR_FREE_MAX) in a local pool so we
> * don't need to go to the memory "store" during an id allocate, just
> * so you don't need to be too concerned about locking and conflicts
> * with the slab allocator.
> */
>
> #ifndef TEST // to test in user space...
> #include <linux/slab.h>
> #include <linux/init.h>
> #include <linux/module.h>
> #endif
> #include <linux/err.h>
> #include <linux/string.h>
> #include <linux/idr.h>
>
> static struct kmem_cache *idr_layer_cache;
>
> static struct idr_layer *get_from_free_list(struct idr *idp)
> {
> struct idr_layer *p;
> unsigned long flags;
>
> spin_lock_irqsave(&idp->lock, flags);
> if ((p = idp->id_free)) {
> idp->id_free = p->ary[0];
> idp->id_free_cnt--;
> p->ary[0] = NULL;

OK, this is the freelist which is inaccessible to readers.

> }
> spin_unlock_irqrestore(&idp->lock, flags);
> return(p);
> }
>
> static void idr_layer_rcu_free(struct rcu_head *head)
> {
> struct idr_layer *layer;
>
> layer = container_of(head, struct idr_layer, rcu_head);
> kmem_cache_free(idr_layer_cache, layer);
> }
>
> static inline void free_layer(struct idr_layer *p)
> {
> call_rcu(&p->rcu_head, idr_layer_rcu_free);
> }
>
> /* only called when idp->lock is held */
> static void __move_to_free_list(struct idr *idp, struct idr_layer *p)
> {
> p->ary[0] = idp->id_free;

OK, this is the freelist which is inaccessible to readers.

> idp->id_free = p;
> idp->id_free_cnt++;
> }
>
> static void move_to_free_list(struct idr *idp, struct idr_layer *p)
> {
> unsigned long flags;
>
> /*
> * Depends on the return element being zeroed.
> */
> spin_lock_irqsave(&idp->lock, flags);
> __move_to_free_list(idp, p);
> spin_unlock_irqrestore(&idp->lock, flags);
> }
>
> static void idr_mark_full(struct idr_layer **pa, int id)
> {
> struct idr_layer *p = pa[0];
> int l = 0;
>
> __set_bit(id & IDR_MASK, &p->bitmap);
> /*
> * If this layer is full mark the bit in the layer above to
> * show that this part of the radix tree is full. This may
> * complete the layer above and require walking up the radix
> * tree.
> */
> while (p->bitmap == IDR_FULL) {
> if (!(p = pa[++l]))
> break;
> id = id >> IDR_BITS;
> __set_bit((id & IDR_MASK), &p->bitmap);
> }
> }
>
> /**
> * idr_pre_get - reserver resources for idr allocation
> * @idp: idr handle
> * @gfp_mask: memory allocation flags
> *
> * This function should be called prior to locking and calling the
> * idr_get_new* functions. It preallocates enough memory to satisfy
> * the worst possible allocation.
> *
> * If the system is REALLY out of memory this function returns 0,
> * otherwise 1.
> */
> int idr_pre_get(struct idr *idp, gfp_t gfp_mask)
> {
> while (idp->id_free_cnt < IDR_FREE_MAX) {
> struct idr_layer *new;
> new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
> if (new == NULL)
> return (0);
> move_to_free_list(idp, new);
> }
> return 1;
> }
> EXPORT_SYMBOL(idr_pre_get);
>
> static int sub_alloc(struct idr *idp, int *starting_id, struct idr_layer **pa)
> {
> int n, m, sh;
> struct idr_layer *p, *new;
> int l, id, oid;
> unsigned long bm;
>
> id = *starting_id;
> restart:
> p = idp->top;

OK, the caller presumably holds an update-side lock.

> l = idp->layers;
> pa[l--] = NULL;
> while (1) {
> /*
> * We run around this while until we reach the leaf node...
> */
> n = (id >> (IDR_BITS*l)) & IDR_MASK;
> bm = ~p->bitmap;
> m = find_next_bit(&bm, IDR_SIZE, n);
> if (m == IDR_SIZE) {
> /* no space available go back to previous layer. */
> l++;
> oid = id;
> id = (id | ((1 << (IDR_BITS * l)) - 1)) + 1;
>
> /* if already at the top layer, we need to grow */
> if (!(p = pa[l])) {
> *starting_id = id;
> return IDR_NEED_TO_GROW;
> }
>
> /* If we need to go up one layer, continue the
> * loop; otherwise, restart from the top.
> */
> sh = IDR_BITS * (l + 1);
> if (oid >> sh == id >> sh)
> continue;
> else
> goto restart;
> }
> if (m != n) {
> sh = IDR_BITS*l;
> id = ((id >> sh) ^ n ^ m) << sh;
> }
> if ((id >= MAX_ID_BIT) || (id < 0))
> return IDR_NOMORE_SPACE;
> if (l == 0)
> break;
> /*
> * Create the layer below if it is missing.
> */
> if (!p->ary[m]) {

OK, we aren't dereferencing. Besides, we should hold the update-side
lock at this point.

> new = get_from_free_list(idp);
> if (!new)
> return -1;
> INIT_RCU_HEAD(&new->rcu_head);

Not needed, unless you want this zeroed for debug purposes.

> rcu_assign_pointer(p->ary[m], new);
> p->count++;
> }
> pa[l--] = p;
> p = p->ary[m];

Holding update-side lock.

> }
>
> pa[l] = p;
> return id;
> }
>
> static int idr_get_empty_slot(struct idr *idp, int starting_id,
> struct idr_layer **pa)
> {
> struct idr_layer *p, *new;
> int layers, v, id;
> unsigned long flags;
>
> id = starting_id;
> build_up:
> p = idp->top;

OK, the caller presumably holds an update-side lock.

> layers = idp->layers;
> if (unlikely(!p)) {
> if (!(p = get_from_free_list(idp)))
> return -1;
> INIT_RCU_HEAD(&p->rcu_head);

Not needed, unless you want this zeroed for debug purposes.

> layers = 1;
> }
> /*
> * Add a new layer to the top of the tree if the requested
> * id is larger than the currently allocated space.
> */
> while ((layers < (MAX_LEVEL - 1)) && (id >= (1 << (layers*IDR_BITS)))) {
> layers++;
> if (!p->count)
> continue;
> if (!(new = get_from_free_list(idp))) {
> /*
> * The allocation failed. If we built part of
> * the structure tear it down.
> */
> spin_lock_irqsave(&idp->lock, flags);
> for (new = p; p && p != idp->top; new = p) {
> p = p->ary[0];
> new->ary[0] = NULL;

OK, this presumably has not yet been made accessible to readers.

> new->bitmap = new->count = 0;
> __move_to_free_list(idp, new);
> }
> spin_unlock_irqrestore(&idp->lock, flags);
> return -1;
> }
> new->ary[0] = p;

OK, this presumably has not yet been made accessible to readers.

> new->count = 1;
> INIT_RCU_HEAD(&new->rcu_head);

Not needed, unless you want this zeroed for debug purposes.

> if (p->bitmap == IDR_FULL)
> __set_bit(0, &new->bitmap);
> p = new;
> }
> rcu_assign_pointer(idp->top, p);
> idp->layers = layers;
> v = sub_alloc(idp, &id, pa);
> if (v == IDR_NEED_TO_GROW)
> goto build_up;
> return(v);
> }
>
> static int idr_get_new_above_int(struct idr *idp, void *ptr, int starting_id)
> {
> struct idr_layer *pa[MAX_LEVEL];
> int id;
>
> id = idr_get_empty_slot(idp, starting_id, pa);
> if (id >= 0) {
> /*
> * Successfully found an empty slot. Install the user
> * pointer and mark the slot full.
> */
> rcu_assign_pointer(pa[0]->ary[id & IDR_MASK],
> (struct idr_layer *)ptr);
> pa[0]->count++;
> idr_mark_full(pa, id);
> }
>
> return id;
> }
>
> /**
> * idr_get_new_above - allocate new idr entry above or equal to a start id
> * @idp: idr handle
> * @ptr: pointer you want associated with the ide
> * @start_id: id to start search at
> * @id: pointer to the allocated handle
> *
> * This is the allocate id function. It should be called with any
> * required locks.
> *
> * If memory is required, it will return -EAGAIN, you should unlock
> * and go back to the idr_pre_get() call. If the idr is full, it will
> * return -ENOSPC.
> *
> * @id returns a value in the range 0 ... 0x7fffffff
> */
> int idr_get_new_above(struct idr *idp, void *ptr, int starting_id, int *id)
> {
> int rv;
>
> rv = idr_get_new_above_int(idp, ptr, starting_id);
> /*
> * This is a cheap hack until the IDR code can be fixed to
> * return proper error values.
> */
> if (rv < 0)
> return _idr_rc_to_errno(rv);
> *id = rv;
> return 0;
> }
> EXPORT_SYMBOL(idr_get_new_above);
>
> /**
> * idr_get_new - allocate new idr entry
> * @idp: idr handle
> * @ptr: pointer you want associated with the ide
> * @id: pointer to the allocated handle
> *
> * This is the allocate id function. It should be called with any
> * required locks.
> *
> * If memory is required, it will return -EAGAIN, you should unlock
> * and go back to the idr_pre_get() call. If the idr is full, it will
> * return -ENOSPC.
> *
> * @id returns a value in the range 0 ... 0x7fffffff
> */
> int idr_get_new(struct idr *idp, void *ptr, int *id)
> {
> int rv;
>
> rv = idr_get_new_above_int(idp, ptr, 0);
> /*
> * This is a cheap hack until the IDR code can be fixed to
> * return proper error values.
> */
> if (rv < 0)
> return _idr_rc_to_errno(rv);
> *id = rv;
> return 0;
> }
> EXPORT_SYMBOL(idr_get_new);
>
> static void idr_remove_warning(int id)
> {
> printk(KERN_WARNING
> "idr_remove called for id=%d which is not allocated.\n", id);
> dump_stack();
> }
>
> static void sub_remove(struct idr *idp, int shift, int id)
> {
> struct idr_layer *p = idp->top;

OK, the caller presumably holds an update-side lock.

> struct idr_layer **pa[MAX_LEVEL];
> struct idr_layer ***paa = &pa[0];
> struct idr_layer *to_free;
> int n;
>
> *paa = NULL;
> *++paa = &idp->top;
>
> while ((shift > 0) && p) {
> n = (id >> shift) & IDR_MASK;
> __clear_bit(n, &p->bitmap);
> *++paa = &p->ary[n];

OK, the caller presumably holds an update-side lock.

> p = p->ary[n];
> shift -= IDR_BITS;
> }
> n = id & IDR_MASK;
> if (likely(p != NULL && test_bit(n, &p->bitmap))){
> __clear_bit(n, &p->bitmap);
> rcu_assign_pointer(p->ary[n], NULL);
> to_free = NULL;
> while(*paa && ! --((**paa)->count)){
> if (to_free)
> free_layer(to_free);
> to_free = **paa;
> **paa-- = NULL;
> }
> if (!*paa)
> idp->layers = 0;
> if (to_free)
> free_layer(to_free);
> } else
> idr_remove_warning(id);
> }
>
> /**
> * idr_remove - remove the given id and free it's slot
> * @idp: idr handle
> * @id: unique key
> */
> void idr_remove(struct idr *idp, int id)
> {
> struct idr_layer *p;
> struct idr_layer *to_free;
>
> /* Mask off upper bits we don't use for the search. */
> id &= MAX_ID_MASK;
>
> sub_remove(idp, (idp->layers - 1) * IDR_BITS, id);
> if (idp->top && idp->top->count == 1 && (idp->layers > 1) &&
> idp->top->ary[0]) {

OK, the caller presumably holds the update-side lock.

> /*
> * Single child at leftmost slot: we can shrink the tree.
> * This level is not needed anymore since when layers are
> * inserted, they are inserted at the top of the existing
> * tree.
> */
> to_free = idp->top;
> p = idp->top->ary[0];

OK, the caller presumably holds the update-side lock.

> rcu_assign_pointer(idp->top, p);
> --idp->layers;
> to_free->bitmap = to_free->count = 0;
> free_layer(to_free);
> }
> while (idp->id_free_cnt >= IDR_FREE_MAX) {
> p = get_from_free_list(idp);
> /*
> * Note: we don't call the rcu callback here, since the only
> * layers that fall into the freelist are those that have been
> * preallocated.
> */
> kmem_cache_free(idr_layer_cache, p);
> }
> return;
> }
> EXPORT_SYMBOL(idr_remove);
>
> /**
> * idr_remove_all - remove all ids from the given idr tree
> * @idp: idr handle
> *
> * idr_destroy() only frees up unused, cached idp_layers, but this
> * function will remove all id mappings and leave all idp_layers
> * unused.
> *
> * A typical clean-up sequence for objects stored in an idr tree, will
> * use idr_for_each() to free all objects, if necessay, then
> * idr_remove_all() to remove all ids, and idr_destroy() to free
> * up the cached idr_layers.
> */
> void idr_remove_all(struct idr *idp)
> {
> int n, id, max;
> struct idr_layer *p;
> struct idr_layer *pa[MAX_LEVEL];
> struct idr_layer **paa = &pa[0];
>
> n = idp->layers * IDR_BITS;
> p = idp->top;

OK, the caller presumably holds an update-side lock.

> max = 1 << n;
>
> id = 0;
> while (id < max) {
> while (n > IDR_BITS && p) {
> n -= IDR_BITS;
> *paa++ = p;
> p = p->ary[(id >> n) & IDR_MASK];

OK, the caller presumably holds the update-side lock.

> }
>
> id += 1 << n;
> while (n < fls(id)) {
> if (p)
> free_layer(p);
> n += IDR_BITS;
> p = *--paa;
> }
> }
> rcu_assign_pointer(idp->top, NULL);
> idp->layers = 0;
> }
> EXPORT_SYMBOL(idr_remove_all);
>
> /**
> * idr_destroy - release all cached layers within an idr tree
> * idp: idr handle
> */
> void idr_destroy(struct idr *idp)
> {
> while (idp->id_free_cnt) {
> struct idr_layer *p = get_from_free_list(idp);
> kmem_cache_free(idr_layer_cache, p);
> }
> }
> EXPORT_SYMBOL(idr_destroy);
>
> /**
> * idr_find - return pointer for given id
> * @idp: idr handle
> * @id: lookup key
> *
> * Return the pointer given the id it has been registered with. A %NULL
> * return indicates that @id is not valid or you passed %NULL in
> * idr_get_new().
> *
> * This function can be called under rcu_read_lock(), given that the leaf
> * pointers lifetimes are correctly managed.
> */
> void *idr_find(struct idr *idp, int id)
> {
> int n;
> struct idr_layer *p;
>
> n = idp->layers * IDR_BITS;
> p = rcu_dereference(idp->top);
>
> /* Mask off upper bits we don't use for the search. */
> id &= MAX_ID_MASK;
>
> if (id >= (1 << n))
> return NULL;
>
> while (n > 0 && p) {
> n -= IDR_BITS;
> p = rcu_dereference(p->ary[(id >> n) & IDR_MASK]);
> }
> return((void *)p);
> }
> EXPORT_SYMBOL(idr_find);
>
> /**
> * idr_for_each - iterate through all stored pointers
> * @idp: idr handle
> * @fn: function to be called for each pointer
> * @data: data passed back to callback function
> *
> * Iterate over the pointers registered with the given idr. The
> * callback function will be called for each pointer currently
> * registered, passing the id, the pointer and the data pointer passed
> * to this function. It is not safe to modify the idr tree while in
> * the callback, so functions such as idr_get_new and idr_remove are
> * not allowed.
> *
> * We check the return of @fn each time. If it returns anything other
> * than 0, we break out and return that value.
> *
> * The caller must serialize idr_for_each() vs idr_get_new() and idr_remove().
> */
> int idr_for_each(struct idr *idp,
> int (*fn)(int id, void *p, void *data), void *data)
> {
> int n, id, max, error = 0;
> struct idr_layer *p;
> struct idr_layer *pa[MAX_LEVEL];
> struct idr_layer **paa = &pa[0];
>
> n = idp->layers * IDR_BITS;
> p = rcu_dereference(idp->top);
> max = 1 << n;
>
> id = 0;
> while (id < max) {
> while (n > 0 && p) {
> n -= IDR_BITS;
> *paa++ = p;
> p = rcu_dereference(p->ary[(id >> n) & IDR_MASK]);
> }
>
> if (p) {
> error = fn(id, (void *)p, data);
> if (error)
> break;
> }
>
> id += 1 << n;
> while (n < fls(id)) {
> n += IDR_BITS;
> p = *--paa;
> }
> }
>
> return error;
> }
> EXPORT_SYMBOL(idr_for_each);
>
> /**
> * idr_replace - replace pointer for given id
> * @idp: idr handle
> * @ptr: pointer you want associated with the id
> * @id: lookup key
> *
> * Replace the pointer registered with an id and return the old value.
> * A -ENOENT return indicates that @id was not found.
> * A -EINVAL return indicates that @id was not within valid constraints.
> *
> * The caller must serialize with writers.
> */
> void *idr_replace(struct idr *idp, void *ptr, int id)
> {
> int n;
> struct idr_layer *p, *old_p;
>
> n = idp->layers * IDR_BITS;
> p = idp->top;

OK, the caller presumably holds an update-side lock.

>
> id &= MAX_ID_MASK;
>
> if (id >= (1 << n))
> return ERR_PTR(-EINVAL);
>
> n -= IDR_BITS;
> while ((n > 0) && p) {
> p = p->ary[(id >> n) & IDR_MASK];

OK, the caller presumably holds the update-side lock.

> n -= IDR_BITS;
> }
>
> n = id & IDR_MASK;
> if (unlikely(p == NULL || !test_bit(n, &p->bitmap)))
> return ERR_PTR(-ENOENT);
>
> old_p = p->ary[n];

OK, the caller presumably holds the update-side lock.

> rcu_assign_pointer(p->ary[n], ptr);
>
> return old_p;
> }
> EXPORT_SYMBOL(idr_replace);
>
> static void idr_cache_ctor(struct kmem_cache *idr_layer_cache, void *idr_layer)
> {
> memset(idr_layer, 0, sizeof(struct idr_layer));
> }
>
> void __init idr_init_cache(void)
> {
> idr_layer_cache = kmem_cache_create("idr_layer_cache",
> sizeof(struct idr_layer), 0, SLAB_PANIC,
> idr_cache_ctor);
> }
>
> /**
> * idr_init - initialize idr handle
> * @idp: idr handle
> *
> * This function is use to set up the handle (@idp) that you will pass
> * to the rest of the functions.
> */
> void idr_init(struct idr *idp)
> {
> memset(idp, 0, sizeof(struct idr));
> spin_lock_init(&idp->lock);
> }
> EXPORT_SYMBOL(idr_init);
>
>
> /*
> * IDA - IDR based ID allocator
> *
> * this is id allocator without id -> pointer translation. Memory
> * usage is much lower than full blown idr because each id only
> * occupies a bit. ida uses a custom leaf node which contains
> * IDA_BITMAP_BITS slots.
> *
> * 2007-04-25 written by Tejun Heo <htejun@xxxxxxxxx>
> */
>
> static void free_bitmap(struct ida *ida, struct ida_bitmap *bitmap)
> {
> unsigned long flags;
>
> if (!ida->free_bitmap) {
> spin_lock_irqsave(&ida->idr.lock, flags);
> if (!ida->free_bitmap) {
> ida->free_bitmap = bitmap;
> bitmap = NULL;
> }
> spin_unlock_irqrestore(&ida->idr.lock, flags);
> }
>
> kfree(bitmap);
> }
>
> /**
> * ida_pre_get - reserve resources for ida allocation
> * @ida: ida handle
> * @gfp_mask: memory allocation flag
> *
> * This function should be called prior to locking and calling the
> * following function. It preallocates enough memory to satisfy the
> * worst possible allocation.
> *
> * If the system is REALLY out of memory this function returns 0,
> * otherwise 1.
> */
> int ida_pre_get(struct ida *ida, gfp_t gfp_mask)
> {
> /* allocate idr_layers */
> if (!idr_pre_get(&ida->idr, gfp_mask))
> return 0;
>
> /* allocate free_bitmap */
> if (!ida->free_bitmap) {
> struct ida_bitmap *bitmap;
>
> bitmap = kmalloc(sizeof(struct ida_bitmap), gfp_mask);
> if (!bitmap)
> return 0;
>
> free_bitmap(ida, bitmap);
> }
>
> return 1;
> }
> EXPORT_SYMBOL(ida_pre_get);
>
> /**
> * ida_get_new_above - allocate new ID above or equal to a start id
> * @ida: ida handle
> * @staring_id: id to start search at
> * @p_id: pointer to the allocated handle
> *
> * Allocate new ID above or equal to @ida. It should be called with
> * any required locks.
> *
> * If memory is required, it will return -EAGAIN, you should unlock
> * and go back to the ida_pre_get() call. If the ida is full, it will
> * return -ENOSPC.
> *
> * @p_id returns a value in the range 0 ... 0x7fffffff.
> */
> int ida_get_new_above(struct ida *ida, int starting_id, int *p_id)
> {
> struct idr_layer *pa[MAX_LEVEL];
> struct ida_bitmap *bitmap;
> unsigned long flags;
> int idr_id = starting_id / IDA_BITMAP_BITS;
> int offset = starting_id % IDA_BITMAP_BITS;
> int t, id;
>
> restart:
> /* get vacant slot */
> t = idr_get_empty_slot(&ida->idr, idr_id, pa);
> if (t < 0)
> return _idr_rc_to_errno(t);
>
> if (t * IDA_BITMAP_BITS >= MAX_ID_BIT)
> return -ENOSPC;
>
> if (t != idr_id)
> offset = 0;
> idr_id = t;
>
> /* if bitmap isn't there, create a new one */
> bitmap = (void *)pa[0]->ary[idr_id & IDR_MASK];

OK, the caller presumably holds the update-side lock.

> if (!bitmap) {
> spin_lock_irqsave(&ida->idr.lock, flags);
> bitmap = ida->free_bitmap;
> ida->free_bitmap = NULL;
> spin_unlock_irqrestore(&ida->idr.lock, flags);
>
> if (!bitmap)
> return -EAGAIN;
>
> memset(bitmap, 0, sizeof(struct ida_bitmap));
> rcu_assign_pointer(pa[0]->ary[idr_id & IDR_MASK],
> (void *)bitmap);
> pa[0]->count++;
> }
>
> /* lookup for empty slot */
> t = find_next_zero_bit(bitmap->bitmap, IDA_BITMAP_BITS, offset);
> if (t == IDA_BITMAP_BITS) {
> /* no empty slot after offset, continue to the next chunk */
> idr_id++;
> offset = 0;
> goto restart;
> }
>
> id = idr_id * IDA_BITMAP_BITS + t;
> if (id >= MAX_ID_BIT)
> return -ENOSPC;
>
> __set_bit(t, bitmap->bitmap);
> if (++bitmap->nr_busy == IDA_BITMAP_BITS)
> idr_mark_full(pa, idr_id);
>
> *p_id = id;
>
> /* Each leaf node can handle nearly a thousand slots and the
> * whole idea of ida is to have small memory foot print.
> * Throw away extra resources one by one after each successful
> * allocation.
> */
> if (ida->idr.id_free_cnt || ida->free_bitmap) {
> struct idr_layer *p = get_from_free_list(&ida->idr);
> if (p)
> kmem_cache_free(idr_layer_cache, p);
> }
>
> return 0;
> }
> EXPORT_SYMBOL(ida_get_new_above);
>
> /**
> * ida_get_new - allocate new ID
> * @ida: idr handle
> * @p_id: pointer to the allocated handle
> *
> * Allocate new ID. It should be called with any required locks.
> *
> * If memory is required, it will return -EAGAIN, you should unlock
> * and go back to the idr_pre_get() call. If the idr is full, it will
> * return -ENOSPC.
> *
> * @id returns a value in the range 0 ... 0x7fffffff.
> */
> int ida_get_new(struct ida *ida, int *p_id)
> {
> return ida_get_new_above(ida, 0, p_id);
> }
> EXPORT_SYMBOL(ida_get_new);
>
> /**
> * ida_remove - remove the given ID
> * @ida: ida handle
> * @id: ID to free
> */
> void ida_remove(struct ida *ida, int id)
> {
> struct idr_layer *p = ida->idr.top;
> int shift = (ida->idr.layers - 1) * IDR_BITS;
> int idr_id = id / IDA_BITMAP_BITS;
> int offset = id % IDA_BITMAP_BITS;
> int n;
> struct ida_bitmap *bitmap;
>
> /* clear full bits while looking up the leaf idr_layer */
> while ((shift > 0) && p) {
> n = (idr_id >> shift) & IDR_MASK;
> __clear_bit(n, &p->bitmap);
> p = p->ary[n];

OK, the caller presumably holds the update-side lock.

> shift -= IDR_BITS;
> }
>
> if (p == NULL)
> goto err;
>
> n = idr_id & IDR_MASK;
> __clear_bit(n, &p->bitmap);
>
> bitmap = (void *)p->ary[n];

OK, the caller presumably holds the update-side lock.

> if (!test_bit(offset, bitmap->bitmap))
> goto err;
>
> /* update bitmap and remove it if empty */
> __clear_bit(offset, bitmap->bitmap);
> if (--bitmap->nr_busy == 0) {
> __set_bit(n, &p->bitmap); /* to please idr_remove() */
> idr_remove(&ida->idr, idr_id);
> free_bitmap(ida, bitmap);
> }
>
> return;
>
> err:
> printk(KERN_WARNING
> "ida_remove called for id=%d which is not allocated.\n", id);
> }
> EXPORT_SYMBOL(ida_remove);
>
> /**
> * ida_destroy - release all cached layers within an ida tree
> * ida: ida handle
> */
> void ida_destroy(struct ida *ida)
> {
> idr_destroy(&ida->idr);
> kfree(ida->free_bitmap);
> }
> EXPORT_SYMBOL(ida_destroy);
>
> /**
> * ida_init - initialize ida handle
> * @ida: ida handle
> *
> * This function is use to set up the handle (@ida) that you will pass
> * to the rest of the functions.
> */
> void ida_init(struct ida *ida)
> {
> memset(ida, 0, sizeof(struct ida));
> idr_init(&ida->idr);
>
> }
> EXPORT_SYMBOL(ida_init);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/