Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:

> Is it a problem because, if there are multiple copies of the same remote file
> in its cache, then FS-cache doesn't know, upon reconnection, which item to
> match against a particular remote file?
There are multiple copies of the same remote file that are described by the
same remote parameters. Same IP address, same port, same NFS version, same
FSID, same FH. The difference may be a local connection parameter.
> An adequate first pass at FS-cache can be done without guaranteeing
> persistence.
True. But it's not particularly interesting to me in such a case.
> There are a host of other issues that need exposure -- steady-state
> performance;
Meaning what?
I have been measuring the performance improvement and degradation numbers, and
I can say that if you have one client and one server, the server has all the
files in memory, and there's gigabit ethernet between them, an on-disk cache
really doesn't help.
Basically, the consideration of whether to use a cache is a compromise between
a host of factors.
> cache garbage collection
Done.
> and reclamation;
Done.
> cache item aliasing;
Partly done.
> whether all files on a mount point should be cached on disk, or some in
> memory and some on disk;
I've thought about that, but no-one seems particularly interested in
discussing it.
> And what would it harm if FS-cache decides that certain items in its cache
> have become ambiguous or otherwise unusable after a reconnection event, thus
> it reclaims them instead of re-using them?
It depends.
At some point I'd like to make disconnected operation possible, and that means
storing data to be written back in the cache. You can't necessarily just
chuck that away.
I can't just say: "Well, it'll oops if you configure your NFS shares like
that, so don't.  It's not worth me implementing round it."
> What causes that instability?  Why can't you insulate against the instability
> but allow cache incoherence and aliased cache items?
Insulate how?  The only way to do that is to add something to the cache key
that says that these two otherwise identical items are actually different
things.
> I'm arguing that cache coherence isn't supported by the NFS protocol, so how
> can FS-cache *require* a facility to support persistent local caching that
> the protocol doesn't have in the first place?
NFS has just enough to just about support a persistent local cache for
unmodified files. It has unique file keys per server, and it has a (limited)
amount of coherency data per file. That's not really the problem.
The problem is that the client can create loads of different views of a remote
export and the kernel treats them as if they're views of different remote
exports. These views do not necessarily have *anything* to distinguish them
at all (nosharecache option).
Now, for the case of cached clients, we can enforce a reduction of incoherency
by requiring that one remote inode map to a single client inode if that inode
is going to be placed in the persistent cache.
> Invalidating is cheap for in-memory caches.  Frequent invalidation is going
> to be expensive for FS-cache, since it requires some disk I/O (and perhaps
> even file truncation).
So what? That's one of the compromises you have to make if you want an
on-disk cache. The invalidation is asynchronous anyway.