Re: [PATCH 2/3] Fix fsync livelock

From: Mikulas Patocka
Date: Sun Oct 05 2008 - 19:18:50 EST


On Sun, 5 Oct 2008, Arjan van de Ven wrote:

> On Sun, 5 Oct 2008 19:02:57 -0400 (EDT)
> Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
> > > are you sure?
> > > isn't the right fix to just walk the file pages only once?
> >
> > It walks the pages only once --- and waits on each of them. But
> > because new pages are constantly appearing under it, that "walk only
> > once" becomes an infinite loop (well, finite, until the whole disk is
> > written).
>
> well. fsync() promises that everything that's dirty at the time of the
> call will hit the disk. That is not something you can compromise.
> The only way out would be to not allow new dirty pages during an
> fsync()... which is imo even worse.
>
> Submit them all in one go, then wait, should not be TOO bad. Unless a
> lot was dirty already, but then you go back to "but it has to go to
> disk".

The problem here is that you have two processes --- one is writing, the
other is simultaneously syncing. The syncing process can't distinguish
between the pages that were created before fsync() was invoked (it has to
wait on them) and the pages that were created while fsync() was running
(it doesn't have to wait on them) --- so it waits on them all. The result
is a livelock: it waits indefinitely, because more and more pages are
being created.

The patch changes it so that if fsync() waits long enough, it stops the
other writers from creating dirty pages.
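
To make the effect concrete, here is a toy model in Python (not the patch
itself, and nothing like real kernel code). The function name fsync_wait,
the patience threshold, and the per-tick rates are all invented for
illustration. It only assumes that the writer adds pages as fast as the
disk completes them, and that writers get blocked once the syncer has
waited for `patience` ticks.

```python
def fsync_wait(initial, write_rate, disk_rate, patience=None, max_ticks=10_000):
    """Return the number of ticks until no pages are under writeback.

    Returns None if max_ticks is reached first; the cap stands in for
    "the whole disk is written". All parameters are hypothetical.
    """
    in_flight = initial  # pages under writeback when fsync() starts
    for tick in range(max_ticks):
        if in_flight == 0:
            return tick
        # Assumed throttle: block writers once we have waited long enough.
        writers_blocked = patience is not None and tick >= patience
        if not writers_blocked:
            in_flight += write_rate  # the other process dirties more pages
        in_flight = max(0, in_flight - disk_rate)  # the disk completes some
    return None


if __name__ == "__main__":
    print(fsync_wait(100, 10, 10))               # None: never drains
    print(fsync_wait(100, 10, 10, patience=50))  # 60: drains after throttling
```

Without the throttle the backlog never shrinks, so the wait hits the cap.
With it, the backlog drains in a bounded number of ticks once the writer
is stopped.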

Or, how otherwise would you implement "Submit them all in one go, then
wait"? The current code is:
you grab page 0, see it is under writeback, wait on it
you grab page 1, see it is under writeback, wait on it
you grab page 2, see it is under writeback, wait on it
you grab page 3, see it is under writeback, wait on it
...
--- and the other process is just making more and more writeback pages
while your waiting routine runs. So the waiting is indefinite.
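
The loop above can be sketched as a small Python simulation. This is a
simplification under stated assumptions: pages are only appended at the end
of the file, each wait lets the writer add new_pages_per_wait pages, and
disk_capacity bounds the file. The `snapshot` flag is hypothetical: it
models a syncer that could remember which pages existed when fsync() was
called, which the current code cannot do.

```python
def walk_and_wait(pages_at_call, new_pages_per_wait, disk_capacity, snapshot):
    """Count how many pages the syncer waits on before the walk ends."""
    total = pages_at_call  # pages currently in the file
    index = 0              # next page the syncer will grab
    waited = 0
    while True:
        # With a snapshot the walk stops at the pages present at call time;
        # without one it chases the ever-growing end of the file.
        end = pages_at_call if snapshot else total
        if index >= end:
            return waited
        waited += 1  # grab page `index`, see it under writeback, wait on it
        # While we wait, the other process makes more writeback pages.
        total = min(disk_capacity, total + new_pages_per_wait)
        index += 1


if __name__ == "__main__":
    print(walk_and_wait(4, 1, 1000, snapshot=False))  # 1000
    print(walk_and_wait(4, 1, 1000, snapshot=True))   # 4
```

In the first call the walk only ends once the file has grown to the disk
capacity, which is the "finite, until the whole disk is written" case. In
the second call it ends after the four pages that existed at call time.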

Mikulas