Re: invalidate_inode_pages in 2.5.32/3

From: Daniel Phillips
Date: Mon Sep 23 2002 - 14:13:26 EST

On Monday 23 September 2002 18:38, Trond Myklebust wrote:
> Andrew Morton writes:
> > Look, idunnoigiveup. Like scsi and USB, NFS is a black hole
> > where akpms fear to tread. I think I'll sulk until someone
> > explains why this work has to be performed in the context of a
> > process which cannot do it.
> I'd be happy to move that work out of the RPC callbacks if you could
> point out which other processes actually can do it.
> The main problem is that the VFS/MM has no way of relabelling pages as
> being invalid or no longer up to date: I once proposed simply clearing
> PG_uptodate on those pages which cannot be cleared by
> invalidate_inode_pages(), but this was not to Linus' taste.

I'll take a run at analyzing this.

First, it's clear why we can't just set the page !uptodate: if we fail to
lock the page we can't change the state of the uptodate bit because we
would violate the locking rules, iow, we would race with the vfs (see

Note that even if we succeed in the TryLock and set !uptodate, we still
have to walk the rmap list and unmap the page or it won't get refaulted
and the uptodate bit will be ignored.
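
To make the ordering concrete, here is a rough userspace model of the
lockable case. It is only a sketch: struct model_page, model_unmap_all
and model_try_invalidate are names I made up for illustration, the
atomic_flag stands in for the page lock bit, and resetting mapcount
stands in for the real rmap walk. None of this is kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page cache page (not the kernel's struct page). */
struct model_page {
    atomic_flag locked;   /* models the page lock bit */
    bool uptodate;        /* models PG_uptodate */
    int mapcount;         /* models the number of ptes on the rmap list */
};

/* Stand-in for walking the rmap list and tearing down every mapping. */
static void model_unmap_all(struct model_page *pg)
{
    pg->mapcount = 0;
}

/*
 * Non-blocking invalidation attempt.
 * Returns false and leaves the page untouched if the lock is already held,
 * because changing uptodate without the lock would break the locking rules.
 */
static bool model_try_invalidate(struct model_page *pg)
{
    assert(pg != NULL);

    /* test-and-set returns the old value: true means someone holds the lock */
    if (atomic_flag_test_and_set(&pg->locked))
        return false;

    pg->uptodate = false;   /* only legal while we hold the lock */
    model_unmap_all(pg);    /* otherwise no refault ever looks at uptodate */

    atomic_flag_clear(&pg->locked);
    return true;
}
```

The point of the model is the failure branch: when the test-and-set
fails, the caller has learned nothing except that it must not touch the
page, which is exactly the situation the cases below are about.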

For any page we can't lock without blocking, the cases are:

 1) Dirty: we don't need to invalidate it because it's going to get
    written back to the server anyway

 2) Locked, clean: the page could be locked for any number of reasons.
    Probably, it's locked for reading though. We *obviously* need to
    kill this page at some point or we have a nasty heisenbug. E.g.,
    somebody, somewhere, will get a file handed back to them from some
    other client that rewrote the whole thing, complete and correct
    except for a stale page or two.

For pages that we can lock, we have:

 3) Elevated count, clean: we could arguably ignore the use count
    and just yank the page out of the inode list, as Andrew's patch
    does. Getting it out of the mapping is harder, perhaps much

 4) Clean, has buffers, can't get rid of the buffers: we can't know
    why. HTree puts pages in this state for directory access, Ext3
    probably does it for a variety of reasons. Same situation as

Given the obviously broken case (2) above and the two probably broken
cases (3) and (4), I don't see any way to ignore this problem and still
implement the NFS semantics Chuck described earlier.
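
The four cases can be written down as a small classifier. This is a
sketch under assumptions: struct page_state, its fields and the enum
names are invented for illustration, and treating a dirty page as
case (1) whether or not it is locked is my simplification, since only
the unlockable dirty page is discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the state that decides which case applies. */
struct page_state {
    bool locked;            /* a trylock on the page would fail */
    bool dirty;             /* page is waiting for writeback */
    int  extra_refs;        /* references beyond the page cache's own */
    bool has_buffers;       /* page has buffers attached */
    bool buffers_freeable;  /* those buffers could be released right now */
};

enum inval_case {
    CASE_NONE = 0,            /* nothing in the way, invalidate normally */
    CASE_DIRTY = 1,           /* (1) will be written back anyway */
    CASE_LOCKED_CLEAN = 2,    /* (2) obviously broken if skipped */
    CASE_ELEVATED_COUNT = 3,  /* (3) probably broken if skipped */
    CASE_STUCK_BUFFERS = 4    /* (4) probably broken if skipped */
};

static enum inval_case classify(const struct page_state *s)
{
    assert(s != NULL);

    if (s->dirty)
        return CASE_DIRTY;
    if (s->locked)
        return CASE_LOCKED_CLEAN;
    if (s->extra_refs > 0)
        return CASE_ELEVATED_COUNT;
    if (s->has_buffers && !s->buffers_freeable)
        return CASE_STUCK_BUFFERS;
    return CASE_NONE;
}

/* Cases (2) to (4) are the ones that cannot simply be skipped. */
static bool needs_deferred_invalidate(enum inval_case c)
{
    return c >= CASE_LOCKED_CLEAN;
}
```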

I see Rik's suggestion of marking the problem pages invalid and walking
the ptes to protect them as the cleanest fix. Unlike invalidate_inode_pages,
the fault path can block perfectly happily while the problem conditions
sort themselves out.
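
As a rough illustration of that split, here is a userspace model in
which invalidation never blocks and the fault path does. Everything in
it is hypothetical: struct cached_page, mark_invalid, fault_in and
refetch_from_server are made-up names, a pthread mutex stands in for
the page lock, and a boolean stands in for the pte. It does not try to
close the window where a second invalidation arrives during the
refetch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

/* Hypothetical stand-in for a cached page plus its one mapping. */
struct cached_page {
    pthread_mutex_t lock;       /* models the page lock */
    atomic_bool invalid;        /* "problem page" marker, set without the lock */
    atomic_bool pte_present;    /* false means the next access will fault */
    char data[MODEL_PAGE_SIZE];
};

typedef void (*refetch_fn)(char *buf, size_t len);

/* Example refetch callback; stands in for rereading from the server. */
static void refetch_from_server(char *buf, size_t len)
{
    snprintf(buf, len, "fresh");
}

/* Invalidation side: never blocks, never takes the page lock. */
static void mark_invalid(struct cached_page *pg)
{
    atomic_store(&pg->invalid, true);
    atomic_store(&pg->pte_present, false);   /* protect the pte */
}

/* Fault side: free to sleep on the lock until the page is quiescent. */
static const char *fault_in(struct cached_page *pg, refetch_fn refetch)
{
    assert(pg != NULL && refetch != NULL);

    pthread_mutex_lock(&pg->lock);           /* may block, which is fine here */
    if (atomic_exchange(&pg->invalid, false))
        refetch(pg->data, sizeof pg->data);  /* replace the stale contents */
    atomic_store(&pg->pte_present, true);    /* re-establish the mapping */
    pthread_mutex_unlock(&pg->lock);

    return pg->data;
}
```

The invalidation side only flips flags, so it is safe from a context
that cannot sleep; all the waiting is pushed into fault_in.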

