Re: + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree

From: Shailabh Nagar
Date: Tue Jun 27 2006 - 12:04:45 EST


Shailabh Nagar wrote:

Arjan van de Ven wrote:

also in response to
Subject: Re: 2.6.17-mm3
Date: Tue, 27 Jun 2006 16:12:42 +0200
with Message-ID:
<6bffcb0e0606270712w166f04a6u237d695e2bfa1913@xxxxxxxxxxxxxx>

On Mon, 2006-06-26 at 12:06 -0700, akpm@xxxxxxxx wrote:



+static inline void taskstats_tgid_alloc(struct signal_struct *sig)
+{
+ struct taskstats *stats;
+
+ stats = kmem_cache_zalloc(taskstats_cache, SLAB_KERNEL);
+ if (!stats)
+ return;
+
+ spin_lock(&sig->stats_lock);
+ if (!sig->stats) {


+static inline void taskstats_tgid_free(struct signal_struct *sig)
+{
+ struct taskstats *stats = NULL;
+ spin_lock(&sig->stats_lock);
+ if (sig->stats) {
+ stats = sig->stats;
+ sig->stats = NULL;
+ }
+ spin_unlock(&sig->stats_lock);
+ if (stats)
+ kmem_cache_free(taskstats_cache, stats);
+}

this is buggy and deadlock prone!
(found by lockdep)

stats_lock nests within tasklist_lock, which is taken in irq context.
(and thus needs irq_save treatment). But because of this nesting, it is
not allowed to take stats_lock without disabling irqs, or you can have
the following scenario

CPU 0                           CPU 1
(in irq)                        (in the code above)
                                stats_lock is taken
tasklist_lock is taken
stats_lock is taken <spin>
                                <interrupt happens>
                                tasklist_lock is taken
which now forms an AB-BA deadlock!


this happens at least in copy_process which can call taskstats_tgid_free
without first disabling interrupts (via cleanup_signal).


Arjan,

I didn't get how stats_lock nests within tasklist_lock in copy_process.
The call to cleanup_signal (or any call to taskstats_tgid_alloc/free) seems to happen
without the tasklist lock held? Or maybe I'm missing something.

Yup...indeed I was.

In release_task(), __cleanup_signal() is being called within tasklist_lock
protection and that leads to stats_lock being held nested within tasklist_lock.


Changing to an irqsave variant isn't a problem of course...

--Shailabh

There may be many other cases; I've not checked deeper yet.

Solution should be to make these functions use irqsave variant... any
comments from the authors of this patch ?

Will submit a patch to switch to irqsave variants.
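
For reference, a minimal sketch of the shape such a change could take for
taskstats_tgid_free. This is not the submitted patch. It is a single-threaded
userspace model so that it compiles on its own: irqs_enabled,
model_lock_irqsave, model_unlock_irqrestore and taskstats_tgid_free_model are
hypothetical stand-ins, and free() stands in for kmem_cache_free(). In the
kernel the corresponding calls would be spin_lock_irqsave() and
spin_unlock_irqrestore() on sig->stats_lock.

```c
/* Userspace model only; every "model_" name here is a hypothetical stand-in. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for the local CPU's interrupt-enable state. */
static bool irqs_enabled = true;

typedef struct { bool locked; } spinlock_t;

/* Models spin_lock_irqsave(): save IRQ state, disable IRQs, then take the lock. */
static void model_lock_irqsave(spinlock_t *lock, unsigned long *flags)
{
	*flags = irqs_enabled;
	irqs_enabled = false;
	assert(!lock->locked);	/* a real spinlock would spin here instead */
	lock->locked = true;
}

/* Models spin_unlock_irqrestore(): drop the lock, then restore the saved IRQ state. */
static void model_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
{
	lock->locked = false;
	irqs_enabled = (flags != 0);
}

struct taskstats { int placeholder; };

struct signal_struct {
	spinlock_t stats_lock;
	struct taskstats *stats;
};

/* Same logic as taskstats_tgid_free in the patch, with the irqsave lock pair. */
static void taskstats_tgid_free_model(struct signal_struct *sig)
{
	struct taskstats *stats = NULL;
	unsigned long flags;

	model_lock_irqsave(&sig->stats_lock, &flags);
	if (sig->stats) {
		stats = sig->stats;
		sig->stats = NULL;
	}
	model_unlock_irqrestore(&sig->stats_lock, flags);

	free(stats);	/* free(NULL) is a no-op, like the "if (stats)" guard */
}
```

With interrupts off for as long as stats_lock is held, the interrupt on CPU 1
in the scenario above cannot arrive inside the critical section, so the AB-BA
ordering cannot form. The saved flags also mean a caller that already runs
with interrupts disabled gets them back disabled.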

Thanks,
Shailabh


