Re: Deadlock on concurrent remount-ro and mdadm --stop (2.6.37)

From: Neil Brown
Date: Mon Jan 24 2011 - 17:58:14 EST


On Sun, 23 Jan 2011 17:15:00 +0100
Harald Braumann <harry@xxxxxxxxxx> wrote:

> On Fri, Jan 21, 2011 at 01:47:54PM +0100, Harald Braumann wrote:
> > On shutdown `umountroot' and `mdadm-raid' are executed
> > concurrently. The actual commands are:
> > mount -o remount,ro /
> > mdadm --stop --scan
> >
> > This seems to trigger a deadlock in the kernel. Dumping blocked
> > tasks (sysrq-W) gives:
> >
> > md126_raid5: call trace:
> > md_super_wait
> > autoremove_wake_function
> > bitmap_unplug
> > ...
> >
> > mount: call trace:
> > sync_page
> > io_schedule
> > sync_page
>
> I've now disabled parallel boot in my system, so umountroot and mdadm
> shouldn't be called concurrently anymore. But I still get lock-ups on
> mdadm --stop. Umountroot is called before mdadm --stop, and all other
> filesystems are unmounted before that, so I guess
> there should be no `sync_page' in progress anymore. But I also have
> swap on the raid, so maybe there's a problem?
>
> As not even the magic SysRQ keys work, I can't provide any more
> details. Any suggestions how I could debug this problem?

Thanks for the bug report.
Extra useful details would include:
cat /proc/mdstat
cat /proc/mounts

i.e. what is the exact configuration of the RAID or RAIDs, and where
are they mounted.
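
A minimal sketch of one way to capture both files in a single report
before starting the shutdown. The helper name collect_state and the
default output path are assumptions for illustration, not anything
mdadm or the init scripts provide; choose an output location that
survives the reboot.

```shell
#!/bin/sh
# Hypothetical helper: print each named file under a labelled header,
# noting any file that cannot be read instead of failing.
collect_state() {
    for f in "$@"; do
        printf '=== %s ===\n' "$f"
        if [ -r "$f" ]; then
            cat "$f"
        else
            printf '(not readable)\n'
        fi
    done
}

# OUT is an assumed variable; override it to pick another location.
collect_state /proc/mdstat /proc/mounts > "${OUT:-/tmp/md-state.txt}"

# Flush to disk now, since the machine may lock up later.
sync
```

Running this while the system is still fully up avoids depending on
the state of the root filesystem during the remount.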

Thanks,
NeilBrown