Re: FS-Cache: Duplicate cookie detected

From: Christian Kujau
Date: Tue Mar 12 2019 - 02:07:42 EST


On Mon, 11 Mar 2019, David Howells wrote:
> I've a couple more patches for you - one a bugfix and one that will print more
> information. They don't actually affect the problem you're seeing. I'll post
> them as replies to this message.

Thanks for the patches. I've applied all three to v5.0 and ran
"nfstest_cache" and was able to reproduce the messages. Please note that
I'm only running "nfstest_cache" because it's somehow able to reproduce
the message reliably - otherwise the message just shows up once or twice
in syslog, but I didn't know how to reproduce it.

But I noticed something else this time, and I did not notice that before:
while running nfstest_cache, the "duplicate cookie" messages were only
triggered when my other, non-test mount was also mounted during the test.
Let me describe my F29 test VM again:

* VM boots, and /usr/local/src gets mounted via NFS, read-only, and
with w/o fsc options. cachefilesd isn't even installed here.

* I run nfstest_cache and apparently it's mounting the same NFS export
from the server to /mnt/t, as a readonly mount.

So two mounts, one in /usr/local/src, the other in /mnt/t, both readonly
and both w/o "fsc", but the "duplicate cookie" message is only printed
when /usr/local/src was mounted. If /usr/local/src wasn't mounted, the
test would complete[0] and no "duplicate message was printed. And then I
noticed:

----------------------------------------------------------
$ mount | tail -2 | fold
horus:/usr/local/src on /usr/local/src type nfs4
(ro,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,
hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,
local_lock=none,addr=192.168.0.115)

horus:/ on /mnt/t type nfs4
(rw,relatime,vers=4.1,rsize=4096,wsize=4096,namlen=255,
hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.56.139,
local_lock=none,addr=192.168.0.115)
----------------------------------------------------------

My /usr/local/src mount was mounted with vers=4.2 (default), while
nfstest_cache was mounting its test-mount with vers=4.1! Apart from the
different rsize/wsize values, the version number stood out. And indeed,
when I mount my regular NFS mount /usr/local/src with vers=4.1, the
"duplicate cookie" is no longer printed.

For simplicity, I've attached two logs to this email:

* nfs_no-mount.txt.xz - showing /proc/fs/nfsfs/volumes and
/proc/fs/fscache/stats every 0.01 seconds, while running nfstest_cache
in another terminal. Note that no duplicate "cookie messages" were
triggered, as /usr/local/src was not mounted.

* nfs_with-mount.txt.xz - same, but here /usr/local/src was mounted (and
defaulted to vers=4.2), and thus "duplicate cookie" messages were
printed.


I fear that all this may complicate this strange behaviour, and now we're
examining NFS mount versions, but I only noticed that now, not earlier :-\

I can't comment on the patches much, as you mentioned they won't make the
message go away, but I hope it printed more details now.

Thanks,
Christian.


[0] Again, I'm using nfstest_cache only to trigger the message. Everytime
I execute it, the test fails, because I think it expects a rw-mount:

$ nfstest_cache --server horus --client fedora0 --runtest=acregmin_attr
*** Verify consistency of attribute caching with NFSv4.1 on a file
acregmin = 10
TEST: Running test 'acregmin_attr'
FAIL: Traceback (most recent call last):
File "/usr/bin/nfstest_cache", line 199, in do_file_test
fdw = open(self.absfile, "w")
IOError: [Errno 30] Read-only file system:
'/mnt/t/nfstest_cache_20190311223404_f_1'
TIME: 4.497078s
1 tests (0 passed, 1 failed)
Total time: 5.529826s

--
BOFH excuse #209:

Only people with names beginning with 'A' are getting mail this week (a la Microsoft)

Attachment: nfs_with-mount.txt.xz
Description: application/xz

Attachment: nfs_no-mount.txt.xz
Description: application/xz