How Not To Use kref (was Re: kdbus: add code for buses, domains and endpoints)

From: Al Viro
Date: Thu Oct 30 2014 - 19:38:16 EST


On Wed, Oct 29, 2014 at 03:00:52PM -0700, Greg Kroah-Hartman wrote:

> +static void __kdbus_domain_user_free(struct kref *kref)
> +{
> + struct kdbus_domain_user *user =
> + container_of(kref, struct kdbus_domain_user, kref);
> +
> + BUG_ON(atomic_read(&user->buses) > 0);
> + BUG_ON(atomic_read(&user->connections) > 0);
> +
> + mutex_lock(&user->domain->lock);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> + idr_remove(&user->domain->user_idr, user->idr);
> + hash_del(&user->hentry);
^^^^^^^^^^^^^^^^^^^^^^^^
> + mutex_unlock(&user->domain->lock);
> +
> + kdbus_domain_unref(user->domain);
> + kfree(user);
> +}

> +struct kdbus_domain_user *kdbus_domain_user_unref(struct kdbus_domain_user *u)
> +{
> + if (u)
> + kref_put(&u->kref, __kdbus_domain_user_free);
> + return NULL;
> +}

If you remove an object from some search structures, taking the lock in
destructor is Too Fucking Late(tm). Somebody might have already found
that puppy and decided to pick it (all under that lock) just as we'd
got to that point in destructor and blocked there. Oops...

Normally I'd say "just use kref_put_mutex()", but this case is even worse.
Look:

refcount is 1
A: kref_put_mutex()
see that it's potential 1->0 crossing, need to take mutex
mutex_lock()
B: kref_get()
refcount is 2
A: got the sodding mutex
atomic_dec_and_test
refcount is 1 now
OK, it's not 1->0, after all, just drop the mutex and bugger off
B: kref_put_mutex()
see that it's potential 1->0 crossing, need to take mutex
mutex_lock() blocks
A: mutex_unlock() lets B go
B: ... got it
atomic_dec_and_test
refcount is 0
call the destructor now, which ends with
kdbus_domain_unref(user->domain);
... which just happens to be the last reference to ->domain
... and frees it, along with ->domain->mutex

But what's to guarantee that A will be past the last point where mutex_unlock()
is looking at the mutex? Sure, it's hard to hit, but AFAICS it's not
impossible, especially if the following happens (assuming mutex-dec.h-style
mutices):

B: mutex_lock()
atomic_dec_return -> -1
__mutex_lock_slowpath()
A: mutex_unlock()
atomic_inc_return -> 0
get preempted
B: note that A has already incremented it to 0 and bugger off - we'd got it

and there we go, with A getting the timeslice back and deciding to call
__mutex_unlock_slowpath() when B has already freed the damn thing.

Basically, kref_put_mutex() is only usable when destructor callback cannot end
up freeing the mutex.

kref_get_unless_zero() might be a usable approach, but IMO the whole thing is
simply outside of kref applicability. Using it for something that needs to
deal with removal from search structures from the destructor callback is
already stretching the things; this one is far worse. kref isn't a universal
tool for expressing lifetime cycles. It works for really simple cases and
might eliminate some amount of boilerplate code. It's been greatly oversold
and overused, though...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/