Re: [BUG 2.6.36-rc6] list corruption in module_bug_finalize

From: Thomas Gleixner
Date: Mon Oct 04 2010 - 18:44:02 EST


On Mon, 4 Oct 2010, Arnd Bergmann wrote:

> On Sunday 03 October 2010, Thomas Gleixner wrote:
> > Current mainline triggers a list corruption bug in
> > module_bug_finalize(). dmesg excerpt below.
> >
> > The corresponding code says:
> >
> > 	/*
> > 	 * Strictly speaking this should have a spinlock to protect against
> > 	 * traversals, but since we only traverse on BUG()s, a spinlock
> > 	 * could potentially lead to deadlock and thus be counter-productive.
> > 	 */
> > 	list_add(&mod->bug_list, &module_bug_list);
> >
> > I can see the traversal problem vs. BUG(), but what's protecting the
> > list_add() ? BKL probably did, but is that true anymore ?
>
> BKL hasn't been in this code path since before git.

Fair enough. I have to admit that I did not even look. :)

> I think this relatively recent change caused module_finalize to be
> called without module_mutex held:

Yeah.

> commit 75676500f8298f0ee89db12db97294883c4b768e
> Author: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
> Date: Sat Jun 5 11:17:36 2010 -0600
>
> module: make locking more fine-grained.
>
> Kay Sievers <kay.sievers@xxxxxxxx> reports that we still have some
> contention over module loading which is slowing boot.
>
> Linus also disliked a previous "drop lock and regrab" patch to fix the
> bne2 "gave up waiting for init of module libcrc32c" message.
>
> This is more ambitious: we only grab the lock where we need it.
>
> Signed-off-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
> Cc: Brandon Philips <brandon@xxxxxxxx>
> Cc: Kay Sievers <kay.sievers@xxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>
> Arnd

The patch below cures it.

Thanks,

tglx

---->
diff --git a/lib/bug.c b/lib/bug.c
index 7cdfad8..40f32d8 100644
--- a/lib/bug.c
+++ b/lib/bug.c
@@ -92,18 +92,21 @@ int module_bug_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
 	}

 	/*
-	 * Strictly speaking this should have a spinlock to protect against
-	 * traversals, but since we only traverse on BUG()s, a spinlock
-	 * could potentially lead to deadlock and thus be counter-productive.
+	 * We need to take module_mutex here to protect the list add, though
+	 * it won't protect against a concurrent BUG().
 	 */
+	mutex_lock(&module_mutex);
 	list_add(&mod->bug_list, &module_bug_list);
+	mutex_unlock(&module_mutex);

 	return 0;
 }

 void module_bug_cleanup(struct module *mod)
 {
+	mutex_lock(&module_mutex);
 	list_del(&mod->bug_list);
+	mutex_unlock(&module_mutex);
 }

 #else
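
For anyone who wants to poke at the locking pattern outside the kernel, here is a minimal userspace sketch of the same idea. It is only an illustration under stated assumptions: a pthread mutex stands in for module_mutex, and struct node, node_add(), node_del(), list_check() and run_demo() are hypothetical names, not kernel APIs. Several threads add entries to one shared circular list, each add done under the mutex, and the list is then walked to confirm that every prev/next pair is still consistent.

```c
/* Userspace sketch only; all names here are hypothetical, not kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct node {
	struct node *next, *prev;
};

/* Circular list head, playing the role of module_bug_list. */
static struct node bug_list = { &bug_list, &bug_list };
/* Stand-in for module_mutex. */
static pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Insert right after head. Caller must hold list_mutex. */
static void node_add(struct node *entry, struct node *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink an entry. Caller must hold list_mutex. */
static void node_del(struct node *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

struct worker_arg {
	struct node *nodes;
	int count;
};

/* Each worker adds its own nodes, taking the mutex around every add. */
static void *worker(void *p)
{
	struct worker_arg *arg = p;
	int i;

	for (i = 0; i < arg->count; i++) {
		pthread_mutex_lock(&list_mutex);
		node_add(&arg->nodes[i], &bug_list);
		pthread_mutex_unlock(&list_mutex);
	}
	return NULL;
}

/* Walk the list; return the entry count, or -1 on a broken prev/next pair. */
static int list_check(void)
{
	struct node *pos;
	int n = 0;

	pthread_mutex_lock(&list_mutex);
	for (pos = bug_list.next; pos != &bug_list; pos = pos->next) {
		if (pos->next->prev != pos || pos->prev->next != pos) {
			n = -1;
			break;
		}
		n++;
	}
	pthread_mutex_unlock(&list_mutex);
	return n;
}

/* Run nthreads concurrent adders, check the list, then empty it again. */
int run_demo(int nthreads, int per_thread)
{
	pthread_t *tids = calloc(nthreads, sizeof(*tids));
	struct worker_arg *args = calloc(nthreads, sizeof(*args));
	int i, j, count, rc;

	assert(tids && args);
	for (i = 0; i < nthreads; i++) {
		args[i].count = per_thread;
		args[i].nodes = calloc(per_thread, sizeof(struct node));
		assert(args[i].nodes);
		rc = pthread_create(&tids[i], NULL, worker, &args[i]);
		assert(rc == 0);
	}
	for (i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	count = list_check();

	/* Removal takes the same mutex, mirroring module_bug_cleanup(). */
	for (i = 0; i < nthreads; i++) {
		for (j = 0; j < per_thread; j++) {
			pthread_mutex_lock(&list_mutex);
			node_del(&args[i].nodes[j]);
			pthread_mutex_unlock(&list_mutex);
		}
		free(args[i].nodes);
	}
	free(args);
	free(tids);
	return count;
}
```

With the mutex in place, run_demo(4, 1000) returns 4000. Dropping the lock/unlock pair in worker() makes the pointer updates in node_add() race with each other, which is the kind of corruption reported at the top of this thread. Depending on the toolchain this may need to be built with -pthread.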