>All the world's a VAX syndrome again. Your program should not be making
>that assumption. Users aren't generally happy when they change their
>NFS server and a pile of junk applications fail.
Once again: my application does *not* fail, but attempts to do a graceful
recovery (there is no better solution across NFS).
Nevertheless, this is *beside* the point. We're talking about UNIX-UNIX
NFS systems and their NFS implementation.
>> If there is only one UNIX NFS-server, and there is only one UNIX NFS-client,
>> then the client OS *must* be able to get link() and stat() right in *all*
>> cases except for the case where the NFS server should crash between having performed
>But then you don't need your locking scheme, do you? So you've just reduced
>the problem to pointlessness.
No, I have not. Once you get the simple case right (which Linux's NFS
implementation apparently did not), you'll see that some of the
multi-client issues *automatically* work better as well. Not all of them,
very few in fact, but just enough to make a carefully written program
reliable.
Anyway, reducing the problem helps to pinpoint it. In this case, the
multi-client case cannot work if not even the single-client case works.
So, in order to concentrate on the first step, I point out the
single-client case. Once that works, we can proceed to the multi-client
case, but we are not there yet.
>> a successful link() and reporting back success to the client.
>> The stat() call should return accurate data in *all* cases. The current
>> caching implementation in Linux does not guarantee this.
>For any case where there are two or more clients it cannot return accurate
>data reliably in any case.
It can, if everyone is using a *different* file. But, as I said before,
this can only work if the single-client case works correctly.
>> I'm not saying that it has to support everything directly. But the
>> OS (Linux in this case) *must* emulate everything to the best of its
>> abilities (at least for the most simple case where it is the only
>Not in the NFS specification. It would be nice of it to provide you
>with Unix-like link counts, but you couldn't ever use them for anything
>or assume they were valid except in the only client case, which you don't
>need it for anyway.
From the program's point of view, I don't even *know* if I'm on top of
an NFS filesystem. I'm programming in a UNIX/POSIX environment. It
has a link() call and an st_nlink attribute. It's the task of the
kernel to make sure the call and the attribute work in harmony.
Meaning: if link() returns success, and I am the only one accessing the
file, then st_nlink should increase. If it doesn't increase, it's the kernel
that is at fault. I don't care what the NFS specs say at this point. The
kernel can make it work, and it should. It's the kernel's task to translate
any NFS anomalies into something the application can understand.
Note: I did not discuss the possibility of the link() failing, which is
a slightly different matter.
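
For illustration, the expectation above can be written down as a small
check. This is only a minimal sketch, not the application's actual code:
link_and_check() is a hypothetical name, and the check is only meaningful
under the assumption stated above, namely that no other process touches
the file in the meantime.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 if link() succeeded and st_nlink rose by exactly one,
 *         1 if link() reported success but st_nlink did not rise by one,
 *        -1 if stat() or link() itself failed (that case is not covered here).
 */
int link_and_check(const char *existing, const char *newname)
{
    struct stat before, after;

    assert(existing != NULL && newname != NULL);

    if (stat(existing, &before) != 0)
        return -1;
    if (link(existing, newname) != 0)
        return -1;
    if (stat(existing, &after) != 0)
        return -1;

    /* Single-client assumption: nobody else changed the link count. */
    if (after.st_nlink != before.st_nlink + 1) {
        fprintf(stderr, "link() succeeded but st_nlink went %lu -> %lu\n",
                (unsigned long)before.st_nlink,
                (unsigned long)after.st_nlink);
        return 1;
    }
    return 0;
}
```

A return value of 1 is exactly the situation complained about here: the
kernel said the link was made, but a following stat() does not show it.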
>> but that's not the point here). Forgetting to increase the link count or
>> not flushing the cache is a blatant error.
>Nope, it's not in the NFS specification. Stop treating NFS like a Unix file
>system and you'll get a lot further.
I'm not using a function called NFS_link(). I'm using a function called
link(). The UNIX manual says it increases the hardlink count *if*
it is successful. That's the only thing I care about. link() has to
abide by the UNIX specs. NFS_link() can abide by the NFS specs, but I don't
use that function.
>NFS makes NO guarantees in any situation about supporting unix link counts,
>about supporting POSIX semantics, about caching coherency, about operation
>order or anything else.
If the NFS filesystem in question does not support hardlinks, then the
kernel must make sure that my application gets back an error when it
tries to perform a link(). That's all I need. A kernel that returns success
but doesn't create the hardlink is buggy.
>> >Even altering the cached attributes will not save you as your cache
>> >may be invalid anyway.
>> Not if you're the only client. I would expect *every* UNIX implementation
>> to get at least *that* right.
>Define 'right'. NFS isn't a Unix-specific file system so your notion of right
>is a bit strange.
The program runs in a UNIX kernel environment. The man pages/POSIX define
what is "right". The kernel should make sure that error codes are returned
if the NFS filesystem can't perform something normally possible on
UNIX filesystems. If the NFS filesystem in question *can* perform the action,
then a good UNIX-kernel-NFS-client implementation should be able to perform
the action.
> Since an operation can be indefinitely delayed your
>"only client" is actually "only client who has ever made requests on that
>inode since the beginning of the universe".
Exactly. That is *precisely* the reason why, in this application, I always
create files that have unique names that have not existed "since the beginning
of the universe".
But, I thought that was apparent, since NFS operations cannot guarantee
any ordering in a multi-client case *unless* every client uses its own
unique files. If you were trying to use the same file for the operation,
then this whole discussion would have been futile.
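
One way to get such never-before-used names is to build them from the
hostname, the process ID, the current time and a per-process counter. The
sketch below is an assumption about how this could be done, not the scheme
the application actually uses; make_unique_name() and the name layout are
made up for the illustration.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes "<dir>/_<host>.<pid>.<time>.<counter>"
 * into buf. Returns 0 on success, -1 if the hostname cannot be read
 * or the name does not fit into buf.
 */
int make_unique_name(char *buf, size_t len, const char *dir)
{
    static unsigned long counter;   /* distinguishes calls within one second */
    char host[256];
    int n;

    assert(buf != NULL && dir != NULL);

    if (gethostname(host, sizeof host) != 0)
        return -1;
    host[sizeof host - 1] = '\0';   /* gethostname() may not terminate */

    n = snprintf(buf, len, "%s/_%s.%ld.%ld.%lu", dir, host,
                 (long)getpid(), (long)time(NULL), counter++);
    if (n < 0 || (size_t)n >= len)
        return -1;                  /* truncated or failed */
    return 0;
}
```

The hostname separates the clients and the PID separates the processes on
one client, while the time and the counter guard against a reused PID and
against repeated calls within one process.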
-- 
Sincerely, srb@cuci.nl
Stephen R. van den Berg (AKA BuGless).

A sign seen at the local pizza place: "DO NOT CARRY TAKE-OUT BOXES BY HANDLES"