Re: invalidate_inode_pages in 2.5.32/3

From: Chuck Lever (cel@citi.umich.edu)
Date: Mon Sep 09 2002 - 18:51:40 EST


On Tue, 10 Sep 2002, Daniel Phillips wrote:

> On Tuesday 10 September 2002 00:03, Andrew Morton wrote:
> > Daniel Phillips wrote:
> > >
> > > > I'm very unkeen about using the inaccurate invalidate_inode_pages
> > > > for anything which matters, really. And the consistency of pagecache
> > > > data matters.
> > > >
> > > > NFS should be using something stronger. And that's basically
> > > > vmtruncate() without the i_size manipulation.
> > >
> > > Yes, that looks good. Semantics are basically "and don't come back
> > > until every damm page is gone" which is enforced by the requirement
> > > that we hold the mapping->page_lock though one entire scan of the
> > > truncated region. (Yes, I remember sweating this one out a year
> > > or two ago so it doesn't eat 100% CPU on regular occasions.)
> > >
> > > So, specifically, we want:
> > >
> > > void invalidate_inode_pages(struct inode *inode)
> > > {
> > > truncate_inode_pages(mapping, 0);
> > > }
> > >
> > > Is it any harder than that?
> >
> > Pretty much - need to leave i_size where it was.
>
> This doesn't touch i_size.
>
> > But there are
> > apparently reasons why NFS cannot sleepingly lock pages in this particular
> > context.
>
> If only we knew what those were. It's hard to keep the word 'bogosity'
> from popping into my head.

rpciod must never call a function that sleeps. if this happens, the whole
NFS client stops working until the function wakes up again. this is not
really bogus -- it is similar to restrictions placed on socket callbacks.

async RPC tasks (ie, the rpciod process) invokes invalidate_inode_pages
during normal, everyday processing, so it must not sleep. that's why it
today ignores locked pages.

thus:

1. whatever function purges a file's cached data must not sleep when
    invoked from an async RPC task. likewise, such a function must not
    sleep if the caller holds the file's i_sem.

2. access to the file must be serialized somehow with in-flight I/O
    (locked pages). we don't want to start new reads before all dirty
    pages have been flushed back to the server. dirty pages that have
    not yet been scheduled must be dealt with correctly.

3. mmap'd pages must behave reasonably when a file's cache is purged.
    clean pages should be faulted back in. what to do with dirty mmap'd
    pages?

        - Chuck Lever

--
corporate:	<cel at netapp dot com>
personal:	<chucklever at bigfoot dot com>

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Sep 15 2002 - 22:00:19 EST