Re: [PATCH] Memory management livelock

From: Mikulas Patocka
Date: Fri Oct 03 2008 - 07:26:42 EST


> > *What* is, forever? Data integrity syncs should have pages operated on
> > in-order, until we get to the end of the range. Circular writeback could
> > go through again, possibly, but no more than once.
>
> OK, I have been able to reproduce it somewhat. It is not a livelock,
> but what is happening is that direct IO read basically does an fsync
> on the file before performing the IO. The fsync gets stuck behind the
> dd that is dirtying the pages, and ends up following behind it and
> doing all its IO for it.
>
> The following patch avoids the issue for direct IO, by using the range
> syncs rather than trying to sync the whole file.
>
> The underlying problem I guess is unchanged. Is it really a problem,
> though? The way I'd love to solve it is actually by adding another bit
> or two to the pagecache radix tree, that can be used to transiently tag
> the tree for future operations. That way we could record the dirty and
> writeback pages up front, and then only bother with operating on them.
>
> That's *if* it really is a problem. I don't have much pity for someone
> doing buffered IO and direct IO to the same pages of the same file :)

LVM does (that is where the bug was discovered). Basically, it scans all
the block devices with direct IO and if someone else does buffered IO on
any device simultaneously, it locks up.

That fsync-vs-write livelock is quite improbable (why would some
application do it?) --- although it could be used as a DoS, to get an
unkillable process.
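
For illustration, a minimal userspace sketch of the fsync-vs-write pattern
follows. It is an assumption-laden toy, not the original reproducer: the
function name fsync_behind_writer, the chunk size and the file path are all
hypothetical, and the writer here is bounded, so fsync() always returns. With
an unbounded writer (such as a long-running dd), the fsync() call is the one
that could keep following the writer on the kernels discussed in this thread.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHUNK_SIZE (64 * 1024)   /* arbitrary size chosen for this sketch */

/*
 * Hypothetical helper: a child process dirties pages with buffered writes
 * while the parent calls fsync() on the same file.
 * Returns 0 on success, -1 on any error.
 */
int fsync_behind_writer(const char *path, size_t chunks)
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        /* Writer (plays the role of dd): keeps appending dirty pages. */
        static char buf[CHUNK_SIZE];
        memset(buf, 'x', sizeof(buf));
        for (size_t i = 0; i < chunks; i++) {
            if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
                _exit(1);
        }
        _exit(0);
    }

    /* Syncer: runs concurrently with the writer on the same file. */
    int rc = fsync(fd);

    /* Reap the writer and treat a failed write as an error. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        rc = -1;
    else if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        rc = -1;

    close(fd);
    unlink(path);
    return rc;
}
```

Calling fsync_behind_writer("/tmp/fsync_vs_write_demo.tmp", 16) writes 1 MiB
and should return 0; timing the fsync() while raising the chunk count is one
way to watch how far the sync trails the writer.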

But there is another possible real-world problem --- sync-vs-write ---
i.e. an admin plugs in two disks and copies data from one to the other.
Meanwhile, some unrelated server process executes sync(). The server goes
into a coma until the copy finishes.

Mikulas