Re: 2.6.39 Block layer regression [was [Bug] Boot hangs with 2.6.39-rc[123]]

From: Christoph Hellwig
Date: Fri Apr 15 2011 - 00:23:01 EST


On Thu, Apr 14, 2011 at 08:25:33PM -0700, Linus Torvalds wrote:
> What's the thinking there? It looks very confused to me.

It is. I sent a patch a couple of days ago to fix it.

> Now, clearly RAID seems to be involved in the problem? The main thing
> with that would be that the execution of the requests would tend to
> generate new requests, that go back on the plug queue. Yes? And the
> loop in flush_plug_list() means that they all should get flushed out,
> I assume. But something clearly isn't working, and it does seem to be
> about the RAID kind of setup. So either they didn't get put on the
> plug queue, or the task got a new plug (which _wasn't_ flushed).
>
> Because we're clearly waiting for some request that hasn't completed.
> Where in the plug queues would it be hiding?

There's a thread where Neil explains what the problem with MD is: it
needs a callback at unplug time so it can generate, e.g., the write-intent
bitmap updates or the largest possible writes for RAID5. Jens and Neil
have been looking into it.
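
To make the idea concrete, here is a minimal userspace sketch in C of a
plug that supports callbacks at unplug time. It is a toy model under my
own assumptions, not the kernel's plugging code: every name in it
(toy_plug, toy_plug_flush, toy_md_unplug and so on) is hypothetical, and
requests are plain integers. The flush loop keeps going until both the
callback list and the request list are empty, so requests that a callback
queues during the flush are not left behind.

```c
/* Userspace toy model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING   16
#define MAX_CALLBACKS 4

struct toy_plug;
typedef void (*toy_unplug_fn)(struct toy_plug *plug, void *data);

struct toy_plug {
    int pending[MAX_PENDING];      /* queued request ids */
    size_t npending;
    struct {
        toy_unplug_fn fn;
        void *data;
    } cb[MAX_CALLBACKS];           /* one-shot unplug callbacks */
    size_t ncb;
    int dispatched;                /* requests handed to the "device" */
};

static void toy_plug_init(struct toy_plug *plug)
{
    plug->npending = 0;
    plug->ncb = 0;
    plug->dispatched = 0;
}

static void toy_plug_add_request(struct toy_plug *plug, int id)
{
    assert(plug->npending < MAX_PENDING);
    plug->pending[plug->npending++] = id;
}

static void toy_plug_register_cb(struct toy_plug *plug,
                                 toy_unplug_fn fn, void *data)
{
    assert(plug->ncb < MAX_CALLBACKS);
    plug->cb[plug->ncb].fn = fn;
    plug->cb[plug->ncb].data = data;
    plug->ncb++;
}

/* Flush until nothing is left, including work added while flushing. */
static void toy_plug_flush(struct toy_plug *plug)
{
    while (plug->npending > 0 || plug->ncb > 0) {
        /* Callbacks run first: a driver may queue more requests here. */
        while (plug->ncb > 0) {
            size_t i = --plug->ncb;
            toy_unplug_fn fn = plug->cb[i].fn;
            void *data = plug->cb[i].data;
            fn(plug, data);
        }
        /* Dispatch everything that is queued right now. */
        while (plug->npending > 0) {
            plug->npending--;
            plug->dispatched++;
        }
    }
}

/* Hypothetical MD-like driver that holds writes back until unplug. */
struct toy_md {
    int deferred_writes;           /* writes collected while plugged */
    int bitmap_updates;            /* bitmap updates issued so far */
};

static void toy_md_unplug(struct toy_plug *plug, void *data)
{
    struct toy_md *md = data;

    if (md->deferred_writes == 0)
        return;
    md->bitmap_updates++;
    toy_plug_add_request(plug, 100);   /* one bitmap update */
    toy_plug_add_request(plug, 200);   /* one merged data write */
    md->deferred_writes = 0;
}
```

In this model a caller would queue requests, register toy_md_unplug with
toy_plug_register_cb(), and then call toy_plug_flush(). The three deferred
writes become one bitmap update plus one merged write, and both are
dispatched in the same flush because the outer loop rechecks the lists.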
