On Tue, 16 Jul 2013 19:06:30 +0200 Jan Kara <jack@xxxxxxx> wrote:
With users of radix_tree_preload() run from interrupt (CFQ is one such
possible user), the following race can happen:
radix_tree_preload()
...
radix_tree_insert()
radix_tree_node_alloc()
if (rtp->nr) {
ret = rtp->nodes[rtp->nr - 1];
<interrupt>
...
radix_tree_preload()
...
radix_tree_insert()
radix_tree_node_alloc()
if (rtp->nr) {
ret = rtp->nodes[rtp->nr - 1];
And we give out one radix tree node twice. That clearly results in radix
tree corruption with different results (usually OOPS) depending on which
two users of radix tree race.
Fix the problem by disabling interrupts when working with rtp variable.
In-interrupt user can still deplete our preloaded nodes but at least we
won't corrupt radix trees.
...
There are some questions regarding this patch:
Do we really want to allow in-interrupt users of radix_tree_preload()? CFQ
could certainly do this in older kernels but that particular call site where I
saw the bug hit isn't there anymore so I'm not sure this can really happen with
recent kernels.
Well, it was never anticipated that interrupt-time code would run
radix_tree_preload(). The whole point in the preloading was to be able
to perform GFP_KERNEL allocations before entering the spinlocked region
which needs to allocate memory.
Doing all that from within an interrupt is daft, because the interrupt code
can't use GFP_KERNEL anyway.
Also it is actually harmful to do preloading if you are in interrupt context
anyway. The disadvantage of disallowing radix_tree_preload() in interrupt is
that we would need to tweak radix_tree_node_alloc() to somehow recognize
whether the caller wants it to use preloaded nodes or not and that callers
would have to get it right (although maybe some magic in radix_tree_preload()
could handle that).
Opinions?
BUG_ON(in_interrupt()) :)