Re: invalidate_inode_pages in 2.5.32/3

From: Andrew Morton (akpm@zip.com.au)
Date: Thu Sep 05 2002 - 16:12:23 EST


Trond Myklebust wrote:
>
> >>>>> " " == Andrew Morton <akpm@zip.com.au> writes:
>
> > Probably, it worked OK with the global locking because nobody
> > was taking a temp ref against those pages.
>
> But we still have global locking in that readdir by means of the
> BKL. There should be no nasty races...
>
> > Please tell me exactly what semantics NFS needs in there. Does
> > truncate_inode_pages() do the wrong thing?
>
> Definitely. In most cases we *cannot* wait on I/O completion because
> nfs_zap_caches() can be called directly from the 'rpciod' process by
> means of a callback. Since 'rpciod' is responsible for completing all
> asynchronous I/O, then truncate_inode_pages() would deadlock.

But if there are such pages, invalidate_inode_pages() would not
have removed them anyway?

> As I said: I don't believe the problem here has anything to do with
> invalidate_inode_pages vs. truncate_inode_pages:
> - Pages should only be locked if they are actually being read from
> the server.
> - They should only be refcounted and/or cleared while the BKL is
> held...
> There is no reason why code which worked fine under 2.2.x and 2.4.x
> shouldn't work under 2.5.x.

Trond, there are very good reasons why it broke. Those pages are
visible to the whole world via global data structures - both the
page LRUs and via the superblocks->inodes walk. Those things exist
for legitimate purposes, and it is legitimate for async threads
of control to take a reference on those pages while playing with them.

It just "happened to work" in earlier kernels.

I suspect we can just remove the page_count() test from invalidate
and that will fix everything up. That will give stronger invalidate
and anything which doesn't like that is probably buggy wrt truncate anyway.

Could you test that?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 07 2002 - 22:00:26 EST