I agree that it's fairly bad. But if you make sure that you're the
only client operating on a certain file which has never existed before,
the results aren't so bad (provided the NFS client and server are
good-quality implementations).
>As I recall, one particularly entertaining scenario which would mess up
>using link to lock follows:
Maybe he hasn't kept up with technology, but about six years ago
I already came up with an algorithm that outwits NFS (in this case).
>CLIENT                                SERVER
>
>sends request link(fileA, fileB)
>                                      receives request
>                                      fileA exists, fileB does not
>                                      server performs link and
>                                      returns success.
>
>          reply is lost and client does not see it
>
>Client re-sends the request to
>link fileA to fileB
>                                      server again receives the request
>                                      Both fileA and fileB exist
>                                      link fails. Server replies
>This time the client receives the
>reply. The client believes that the
>link failed (i.e. it was already
>locked), but in fact it succeeded.
That's what a traditional program would have thought, yes. The locking
method I've been using in procmail (since 1991) works roughly as follows:
procmail                client                    server
create unique new
file a (unique in time
and space) using
a regular open() call
                                                  create file (a)
link(a,b)
                        ask server to link(a,b)
                                                  make link(a,b)
                                                  server crashes, ack lost
                                                  server comes back up
                        retry link(a,b)
                                                  file b already exists
                                                  deny link, send back result
                        link(a,b) failed,
                        flush attribute cache
                        for files a and b
                        since they *may*
                        be incorrect now
        (if link() had succeeded, best to flush the cache,
         or increase the st_nlink count by one; not that it
         really matters to procmail, since it skips the
         stat in that case)
                        link(a,b) returns
                        error
Procmail ignores the
"failed" return-code
because it knows it's
unreliable
stat(a)
                        stat(a)
                                                  stat(a)
                                                  server crashes, ack lost
                                                  server comes back up
                        retry stat(a)
                                                  return stat(a) results
                        return stat(a)
Check st_nlink,
if it's equal to
two, the link(a,b)
succeeded.
unlink(a)
....etc.
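
A minimal sketch of the sequence above, in Python rather than procmail's
own C. The function name try_lock, the naming scheme for the unique file
and the O_EXCL flag are assumptions made for illustration, not procmail's
actual code. Unlike procmail, the sketch always does the stat, even when
link() reports success.

```python
import os
import socket
import time


def try_lock(lockfile: str) -> bool:
    """Try to take `lockfile`; return True if we now hold the lock."""
    directory = os.path.dirname(os.path.abspath(lockfile))
    # File "a": a name unique in time and space. Host name, pid and a
    # timestamp are an assumed scheme, chosen only for this sketch.
    unique = os.path.join(
        directory,
        "_lock.%s.%d.%d" % (socket.gethostname(), os.getpid(), time.time_ns()),
    )
    # O_EXCL is an extra safety net here; uniqueness comes from the name.
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    os.close(fd)
    try:
        try:
            os.link(unique, lockfile)  # link(a, b)
        except OSError:
            # Deliberately ignored: after a lost ack, the retried request
            # can report failure although the first attempt succeeded.
            pass
        # The link count of our own file decides: 2 means b points at a.
        return os.stat(unique).st_nlink == 2
    finally:
        os.unlink(unique)  # unlink(a); b, if created, stays as the lock
```

Releasing the lock is then a plain os.unlink(lockfile). The return value
of link() never enters the decision; only st_nlink of the private file
does, which is why a misleading "failed" reply does no harm.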
>The moral of this tale ?
Given the right algorithm, NFS can be outwitted (in some cases).
>Don't expect too much of NFS (and almost anything is too much).
>It was a quick and nasty hack. It was implemented over the wrong transport
>because, at the time, Sun's TCP performance sucked (a problem that was
>subsequently addressed). Had it used TCP, a lot of the out of order nonsense
>would not be an issue.
Hardly. The TCP connection would simply have been broken in the
case of a server crash. The link() would still have succeeded the first
time, causing the same problem upon server recovery.
-- 
Sincerely, srb@cuci.nl
Stephen R. van den Berg (AKA BuGless).

"-- hit any user to continue"