Re: processes hung in D (raid5/dm/ext3)

From: Andrew Morton
Date: Tue Jun 15 2004 - 18:28:49 EST


foo@xxxxxxxxxxxxxxxxx wrote:
>
> On Tue, Jun 15, 2004 at 03:09:32AM -0700, Andrew Morton wrote:
> > Wait for it to happen again, then do
> >
> > echo t > /proc/sysrq-trigger
> > dmesg -s 1000000 > foo
> >
> > then send foo.
>
> Here you go. I did dump 2f /dev/null /export/home. The first time it
> completed, the second it hung right away.
>
> This is with 2.6.7-rc3-bk6 with no other patches, also compiled with a
> newer gcc.
>
> ...

> dump D 000017f4796ce9b2 0 1887 1853 (NOTLB)
> 000001001ae8dbb8 0000000000000006 000001003ffd5948 000001006a6f96f0
> 0000000000009235 ffffffff803e23a0 000001006a6f9a08 ffffffff802fd643
> 0000000000000212 0000000000000001
> Call Trace:<ffffffff802fd643>{raid5_unplug_device+291} <ffffffff80377d0b>{io_schedule+43}
> <ffffffff80155f6a>{__lock_page+250} <ffffffff80155c40>{page_wake_function+0}
> <ffffffff80155c40>{page_wake_function+0} <ffffffff801567f6>{do_generic_mapping_read+502}
> <ffffffff80156a70>{file_read_actor+0} <ffffffff80156d14>{__generic_file_aio_read+372}
> <ffffffff80156dfb>{generic_file_read+123} <ffffffff80169994>{handle_mm_fault+292}
> <ffffffff80122ad8>{do_page_fault+440} <ffffffff80130948>{recalc_task_prio+424}
> <ffffffff8037748c>{thread_return+41} <ffffffff8017bca7>{vfs_read+199}
> <ffffffff8017bf19>{sys_read+73} <ffffffff80123f81>{ia32_sysret+0}
>

OK, well the dump process is stuck in __lock_page(), waiting for a page
lock. I'd suspect that either devicemapper or raid5 lost an I/O completion,
so that page never gets unlocked.
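
To find every task stuck this way in a saved sysrq-t dump, a minimal sketch
follows. It assumes the 2.6-era task line layout seen in the trace above
(name, state, PC, free stack, pid, father) and that task names contain no
spaces. The function name list_d_tasks is made up for illustration.

```shell
#!/bin/sh
# list_d_tasks FILE: print "name pid" for each task whose state column is D
# (uninterruptible sleep). Hypothetical helper, not part of any kernel tool.
# Assumed column layout: name, state, PC, free stack, pid, father, ...
list_d_tasks() {
    # Strip one level of "> " mail quoting, then match on the state column.
    sed 's/^> *//' "$1" |
        awk 'NF >= 6 && $2 == "D" && $5 ~ /^[0-9]+$/ { print $1, $5 }'
}
```

Running list_d_tasks foo against the trace quoted above would print
"dump 1887". Stack and Call Trace lines are skipped because their second
field is not a state letter.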

Please try the latest -mm kernel, which has a few devicemapper changes,
although they are unlikely to fix this.

If it's possible to remove either raid5 or devicemapper from the picture,
that would help us find the problem.

Other than that, the chances of getting this fixed are proportional to your
skill in finding us a way of reproducing it. A good start would be to tell
us exactly which commands were used to set up the LVM and the raid array.
That way a raid/LVM ignoramus like me can take a look ;)
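
If the exact setup commands are no longer at hand, recording the resulting
layout is a reasonable second best. The sketch below only gathers read-only
state and does not replace the original commands. It assumes mdadm and the
LVM2 tools are what was used, which this thread does not confirm; the
function names are made up for illustration.

```shell
#!/bin/sh
# collect_layout OUTFILE: write whatever storage layout info is available.
# Hypothetical helper; the tool list below is an assumption about the setup.
collect_layout() {
    out=$1
    : > "$out"
    # Log the command line, then its output, if the command is installed.
    run_if_present() {
        if command -v "$1" >/dev/null 2>&1; then
            printf '### %s\n' "$*" >> "$out"
            "$@" >> "$out" 2>&1 || printf '(exit status %s)\n' "$?" >> "$out"
        else
            printf '### %s: not installed\n' "$1" >> "$out"
        fi
    }
    run_if_present uname -r
    # /proc/mdstat exists only when the md driver is loaded.
    if [ -r /proc/mdstat ]; then
        printf '### cat /proc/mdstat\n' >> "$out"
        cat /proc/mdstat >> "$out"
    fi
    run_if_present mdadm --detail --scan
    run_if_present dmsetup table
    run_if_present pvdisplay
    run_if_present vgdisplay
    run_if_present lvdisplay
}
```

Something like collect_layout layout.txt, run as root so the tools can read
the devices, gives a file that can be attached alongside the sysrq-t dump.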
