Re: [PATCH] Check for compound pages in set_page_dirty()

From: Jens Axboe
Date: Thu Jul 19 2007 - 04:41:37 EST


On Thu, Jul 19 2007, Jens Axboe wrote:
> On Wed, Jul 18 2007, Hugh Dickins wrote:
> > On Wed, 18 Jul 2007, Jens Axboe wrote:
> > >
> > > Since I had my hands dirty already...
> >
> > Great, thanks. (There's also such a test in fs/nfs/direct.c,
> > but let's not trouble Trond until we've settled what to do here.)
> >
> > >
> > > ---
> > >
> > > [PATCH] Remove PageCompound() checks before calling set_page_dirty()
> > >
> > > Pre commit 41d78ba55037468e6c86c53e3076d1a74841de39 it was illegal
> > > to call set_page_dirty() on a compound page, since it stored the
> > > destructor in the mapping field. But now it's ok, so remove the
> > > ugly PageCompound() checks from bio and direct-io.
> > >
> > > Signed-off-by: Jens Axboe <jens.axboe@xxxxxxxxxx>
> >
> > I was about to Ack that, now that I've found something or other in the
> > libhugetlb testsuite comes this way, even on page[1], without showing
> > any problem.
> >
> > However, I have noticed a particular inefficiency arising: that
> > bio_check_pages_dirty test specifically avoids pages already
> > PageDirty; but hugetlbfs_set_page_dirty carefully redirects to
> > set the head page dirty: so tail pages of a hugetlb compound page
> > will tend never to be PageDirty, and keep on coming back this way.
> >
> > Which led me to look up the origin of those PageCompound tests:
> > Author: Andrew Morton <akpm@xxxxxxxx>
> > Date: Sun Sep 21 01:42:22 2003 -0700
> >
> > [PATCH] Speed up direct-io hugetlbpage handling
> >
> > This patch short-circuits all the direct-io page dirtying logic for
> > higher-order pages. Without this, we pointlessly bounce BIOs up to keventd
> > all the time.
> >
> > diff --git a/fs/bio.c b/fs/bio.c
> > index d016523..2463163 100644
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -532,6 +532,12 @@ void bio_unmap_user(struct bio *bio, int write_to_vm)
> > * check that the pages are still dirty. If so, fine. If not, redirty them
> > * in process context.
> > *
> > + * We special-case compound pages here: normally this means reads into hugetlb
> > + * pages. The logic in here doesn't really work right for compound pages
> > + * because the VM does not uniformly chase down the head page in all cases.
> > + * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
> > + * handle them at all. So we skip compound pages here at an early stage.
> > ...
> >
> > It looks like I was wrong in thinking it was just trying to avoid
> > the crash on page[1].mapping. At the least, your patch needs also
> > to remove that paragraph of comment from Andrew. But really, it
> > looks like those PageCompound tests should stay, unless you can
> > persuade Andrew to Ack their removal.
> >
> > Except (now, how many times can I change my mind in the course of
> > one email?), hugetlbfs_set_page_dirty was specifically added by
> > Ken Chen to avoid losing data via /proc/sys/vm/drop_caches. Yet
> > fs/bio.c is carefully avoiding going there when dirtying a hugepage.
> > How does this work? Looks like those PageCompound tests need to go!
>
> Hehe, that didn't really get us much further, did it? :-)
>
> My opinion is that since the win is marginal at best, we want to remove
> such tests as it just clutters up the code. And it's definitely not
> obvious why the tests are there, since they are not commented at all.
> Since it's even confusing you, then we can't expect the more vm ignorant
> of us (which definitely includes me) to grasp it!
>
> > I'm lost: I hope Andrew and Ken can sort it out for us.
>
> Posting a revised version, still leaving nfs out of it (I'll ping Trond
> to do the same, if this goes in).

FWIW, I ran a hugepage+O_DIRECT read test, that will cause the direct-io
code to hit set_page_dirty() for a compound page - and it works fine for
me. The fio job file used was:

[global]
directory=/data1
bs=4k
direct=1
hugepage-size=4m
iomem=shmhuge

[iothread]
filename=testfile
size=128m
rw=read

Test box is a lowly x86.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/