Re: [PATCH 5/9] writeback: introduce the pageout work

From: Jan Kara
Date: Fri Mar 09 2012 - 05:16:07 EST


On Sat 03-03-12 21:55:58, Wu Fengguang wrote:
> On Fri, Mar 02, 2012 at 11:57:00AM -0800, Andrew Morton wrote:
> > On Fri, 2 Mar 2012 18:39:51 +0800
> > Fengguang Wu <fengguang.wu@xxxxxxxxx> wrote:
> >
> > > > And I agree it's unlikely but given enough time and people, I
> > > > believe someone finds a way to (inadvertedly) trigger this.
> > >
> > > Right. The pageout works could add lots more iput() to the flusher
> > > and turn some hidden statistical impossible bugs into real ones.
> > >
> > > Fortunately the "flusher deadlocks itself" case is easy to detect and
> > > prevent as illustrated in another email.
> >
> > It would be a heck of a lot safer and saner to avoid the iput(). We
> > know how to do this, so why not do it?
>
> My concern about the page lock is, it costs more code and sounds like
> hacking around something. It seems we (including me) have been trying
> to shun away from the iput() problem. Since it's unlikely we are to
> get rid of the already existing iput() calls from the flusher context,
> why not face the problem, sort it out and use it with confident in new
> code?
We can get rid of it in the current code - see my patch set. And also we
don't have to introduce new iput() with your patch set... I don't think
using ->writepage() directly on a locked page would be a good thing because
filesystems tend to ignore it completely (e.g. ext4 if it needs to do an
allocation, or btrfs) or are much less efficient than when ->writepages()
is used. So I'd prefer going through writeback_single_inode() as the rest
of flusher thread.

> Let me try it now. The only scheme iput() can deadlock the flusher is
> for the iput() path to come back to queue some work and wait for it.
Let me stop you right here. You severely underestimate the complexity of
filesystems :). Take for example ext4. To do truncate you need to start a
transaction, to start a transaction, you have to have a space in journal.
To have a space in journal, you may have to wait for any other process to
finish writing. If that process needs to wait for flusher thread to be able
to finish writing, you have a deadlock. And there are other implicit
dependencies like this. And it's similar for other filesystems as well. So
you really want to make flusher thread as light as possible with the
dependencies.

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/