Re: NFS BUG_ON in nfs_do_writepage

From: Nick Piggin
Date: Tue Apr 28 2009 - 07:55:11 EST


On Tue, Apr 28, 2009 at 07:45:17AM -0400, Trond Myklebust wrote:
> On Tue, 2009-04-28 at 06:27 +0200, Nick Piggin wrote:
> > On Sun, Apr 26, 2009 at 01:55:22PM -0400, Trond Myklebust wrote:
> > > On Sun, 2009-04-26 at 17:13 +0200, Nick Piggin wrote:
> > > > This doesn't seem to fix the race, though... on kernels with the
> > > > race still there, it will just open a window where you can have
> > > > a dirty pte but the page not written out.
> > > >
> > > > I don't understand.
> > >
> > > I'm just pointing out that the NFS client already calls
> > > __set_page_dirty_nobuffers() while holding the page lock inside the
> > > nfs_vm_page_mkwrite() call, so having the VM do it too in the call to
> > > set_page_dirty_balance() is actually redundant. IOW: as far as the NFS
> > > code is concerned, we can get rid of the ->set_page_dirty() callback in
> > > that situation.
> > >
> > > I couldn't find any other places in the VM code where we can have a
> > > dirty pte without also having called page_mkwrite() (and hence
> > > __set_page_dirty_nobuffers). As I said, adding a WARN_ON(!PageDirty())
> > > in ->set_page_dirty() didn't ever trigger any cases where the
> > > set_page_dirty() was actually setting the dirty bit (except in the case
> > > where we race with page writeout in do_wp_page() and __do_fault()).
> > >
> > > That's why I believe disabling ->set_page_dirty() is safe here, and will
> > > in fact suffice to fix the page writeout race.
> >
> > Ah, no, I don't think so, because it opens another race where the
> > pte is dirty but the page is marked clean.
>
> So how can that happen?

It happens if the page gets cleaned after page_mkwrite and before the
page table locks are taken again in order to set the pte writeable.
(Actually, page_mkclean only runs if it finds mapcount elevated, so in
the __do_fault case it is enough for the page to be cleaned even after
the locks are taken, as long as that is before mapcount is incremented.)
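
To make the ordering concrete, here is a toy userspace model of that
interleaving. It is only a sketch, not kernel code: struct toy_page and
the toy_* helpers are made-up names that stand in for PageDirty(), the
pte dirty bit, page_mapcount(), ->page_mkwrite(),
clear_page_dirty_for_io() and the tail of the fault path, and the
call_set_page_dirty flag is an assumption used to model whether the
VM's set_page_dirty() call still reaches the filesystem.

```c
#include <stdbool.h>

/* Toy state only; these fields are stand-ins, not real kernel structures. */
struct toy_page {
    bool page_dirty;  /* models PG_dirty on the struct page */
    bool pte_dirty;   /* models a writeable, dirty pte */
    int  mapcount;    /* models page_mapcount() */
};

/* Models ->page_mkwrite(): the filesystem marks the page dirty. */
static void toy_page_mkwrite(struct toy_page *p)
{
    p->page_dirty = true;
}

/* Models clear_page_dirty_for_io(): the pte is only cleaned
 * (page_mkclean) when the page is already seen as mapped. */
static void toy_clear_page_dirty_for_io(struct toy_page *p)
{
    if (p->mapcount > 0)
        p->pte_dirty = false;
    p->page_dirty = false;
}

/* Models the end of the fault: bump mapcount, install the dirty pte,
 * and optionally let set_page_dirty() redirty the page. */
static void toy_finish_fault(struct toy_page *p, bool call_set_page_dirty)
{
    p->mapcount++;
    p->pte_dirty = true;
    if (call_set_page_dirty)
        p->page_dirty = true;
}

/* Runs the racy ordering: mkwrite, then writeback cleans the page
 * before mapcount is raised, then the fault completes. */
static struct toy_page toy_race(bool call_set_page_dirty)
{
    struct toy_page p = { false, false, 0 };

    toy_page_mkwrite(&p);
    toy_clear_page_dirty_for_io(&p);
    toy_finish_fault(&p, call_set_page_dirty);
    return p;
}
```

In this model, toy_race(false) ends with pte_dirty set and page_dirty
clear, which is the state I am worried about; toy_race(true) ends with
both set, because the late set_page_dirty() redirties the page.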


> AFAICS, when the pte is dirtied, we should get a page fault, which
> causes the page itself to be marked dirty by the nfs_vm_page_mkwrite()
> callback.
> When the page gets written out, the VM calls clear_page_dirty_for_io()
> which also causes the pte to be cleaned.
>
> At what point can you therefore have a situation where the pte is dirty
> without the page being marked as dirty too?

