Re: [PATCH] swapoff tmpfs radix_tree: remember to rcu_read_unlock

From: Hugh Dickins
Date: Sat Feb 15 2014 - 18:53:58 EST


On Thu, 13 Feb 2014, Andrew Morton wrote:
> On Wed, 12 Feb 2014 18:45:07 -0800 (PST) Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> > Running fsx on tmpfs with concurrent memhog-swapoff-swapon, lots of
> >
> > BUG: sleeping function called from invalid context at kernel/fork.c:606
> > in_atomic(): 0, irqs_disabled(): 0, pid: 1394, name: swapoff
> > 1 lock held by swapoff/1394:
> > #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6
> > followed by
> > ================================================
> > [ BUG: lock held when returning to user space! ]
> > 3.14.0-rc1 #3 Not tainted
> > ------------------------------------------------
> > swapoff/1394 is leaving the kernel with locks still held!
> > 1 lock held by swapoff/1394:
> > #0: (rcu_read_lock){.+.+.+}, at: [<ffffffff812520a1>] radix_tree_locate_item+0x1f/0x2b6
> > after which the system recovered nicely.
> >
> > Whoops, I long ago forgot the rcu_read_unlock() on one unlikely branch.
> >
> > Fixes: e504f3fdd63d ("tmpfs radix_tree: locate_item to speed up swapoff")
>
> huh. Venerable. I'm surprised that such an obvious blooper wasn't
> spotted at review. Why didn't anyone else hit this.

No surprise that it missed review, obvious though it is in the fix.

And not much surprise that no one else hit this: for most people, even
those using tmpfs and pushing out to swap, swapoff is just something
that happens shortly before the screen goes blank when you shut down
(and, I haven't noticed how distros order it these days, but swapoff
is anyway better done after unmounting any tmpfs mounts, to avoid its
slowness).

And it does need the swapped tmpfs file to be truncated or unlinked
while swapoff is searching through it racily with RCU lookups.
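
To illustrate why the truncation matters, here is a minimal user-space
sketch of the general bug shape: a chunked scan that takes a read-side
lock on each pass and has one early exit that forgets to drop it. It is
not the radix_tree code or the actual patch; the depth counter stands in
for rcu_read_lock()/rcu_read_unlock(), and every name in it
(sketch_locate, sketch_read_lock, shrink_to, and so on) is hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for RCU read-side nesting; not the kernel API. */
static int read_lock_depth;

static void sketch_read_lock(void)
{
	read_lock_depth++;
}

static void sketch_read_unlock(void)
{
	assert(read_lock_depth > 0);
	read_lock_depth--;
}

/*
 * Scan indices from 0 upward in chunks of 10, taking the read lock
 * afresh on each pass.  After the first pass the "file" is shrunk to
 * shrink_to, mimicking a concurrent truncate while the lock is dropped.
 * unlock_on_break selects the buggy (0) or corrected (1) early exit.
 * Returns the lock depth left behind; non-zero means a leaked lock.
 */
int sketch_locate(unsigned long initial_max, unsigned long shrink_to,
		  int unlock_on_break)
{
	unsigned long tree_max = initial_max;	/* current size of the "file" */
	unsigned long cur_index = 0;
	unsigned long max_index;

	read_lock_depth = 0;
	do {
		sketch_read_lock();
		max_index = tree_max;		/* re-sampled on every pass */
		if (cur_index > max_index) {
			/* Only reachable if the file shrank under us. */
			if (unlock_on_break)
				sketch_read_unlock();
			break;
		}
		cur_index += 10;		/* pretend one chunk was searched */
		sketch_read_unlock();
		tree_max = shrink_to;		/* the "truncate" happens here */
	} while (cur_index != 0 && cur_index <= max_index);

	return read_lock_depth;
}
```

With no shrink (sketch_locate(100, 100, 0)) the early exit is never
taken and nothing leaks, even in the buggy variant. Shrinking the file
below the current index (sketch_locate(100, 5, 0)) takes the early exit
with the lock still held, and passing unlock_on_break = 1 leaves it
balanced again.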

What puzzled me more was, why hadn't I seen it before? I don't run
that fsx test particularly often, but have certainly run it dozens
of times between then and now. I think the answer must be where I
said "after which the system recovered nicely": I probably did hit
it before, but wasn't attending to the screen at the time, the
warnings got scrolled off by timestamps I was printing, and I
failed to check dmesg or /var/log/messages afterwards.

>
>
> > Of course, the truth is that I had been hoping to break Johannes's
> > patchset in mmotm, was thrilled to get this on that, then despondent
> > to realize that the only bug I had found was mine. Surprised I've
> > not seen it before in 2.5 years: tried again on 3.14-rc1, got the
> > same after 25 minutes. Probably not serious enough for -stable,
> > but please can we slip the fix into 3.14 - sorry, Johannes's
> > mm-keep-page-cache-radix-tree-nodes-in-check.patch will need a refresh.
>
> I fixed it up.

Thanks!

Hugh