Re: 9p fs-cache tests/benchmark (was: 9p fscache Duplicate cookie detected)

From: Christian Schoenebeck
Date: Mon Apr 11 2022 - 09:44:41 EST


On Montag, 11. April 2022 00:54:47 CEST asmadeus@xxxxxxxxxxxxx wrote:
> Thanks for keeping it up!
>
> Christian Schoenebeck wrote on Sun, Apr 10, 2022 at 06:18:38PM +0200:
> > > I used git-bisect to identify the commit that broke 9p behaviour, and it
> > > is
> > > indeed this one:
> > >
> > > commit eb497943fa215897f2f60fd28aa6fe52da27ca6c (HEAD, refs/bisect/bad)
> > > Author: David Howells <dhowells@xxxxxxxxxx>
> > > Date: Tue Nov 2 08:29:55 2021 +0000
> > >
> > > 9p: Convert to using the netfs helper lib to do reads and caching
>
> Yes, quite a few things changed with that.
>
> > I looked into the errors I get, and as far as I can see it, all
> > misbehaviours that I see, boil down to "Bad file descriptor" (EBADF)
> > errors being the originating cause.
> >
> > The easiest misbehaviours on the guest system I can look into, are errors
>
> > with the git client. For instance 'git fetch origin' fails this way:
> FWIW I didn't report but did try to reproduce, on my machines (tried a
> couple) booting on a small alpine rootfs over 9p works, and I tried some
> git clone/git fetch of variying sizes of local repo (tmpfs in VM -> 9p
> mount of tmpfs on host) to no avail.

I get more convinced that it's a bug on Linux kernel side. When guest is
booted I always immediately get a read("/var/log/wtmp") = EBADF error on
guest. And the 9p command sequence sent to QEMU 9p server were:

...
v9fs_clunk tag 0 id 120 fid 568
v9fs_walk tag 0 id 110 fid 1 newfid 568 nwnames 1
v9fs_rerror tag 0 id 110 err 2
v9fs_walk tag 0 id 110 fid 26 newfid 568 nwnames 1
v9fs_rerror tag 0 id 110 err 2
v9fs_readlink tag 0 id 22 fid 474
v9fs_readlink_return tag 0 id 22 name /run
v9fs_readlink tag 0 id 22 fid 474
v9fs_readlink_return tag 0 id 22 name /run
v9fs_readlink tag 0 id 22 fid 474
v9fs_readlink_return tag 0 id 22 name /run
v9fs_readlink tag 0 id 22 fid 474
v9fs_readlink_return tag 0 id 22 name /run
v9fs_walk tag 0 id 110 fid 633 newfid 568 nwnames 1
v9fs_rerror tag 0 id 110 err 2
v9fs_walk tag 0 id 110 fid 875 newfid 568 nwnames 0
v9fs_walk_return tag 0 id 110 nwnames 0 qids (nil)
v9fs_open tag 0 id 12 fid 568 mode 32769
v9fs_open_return tag 0 id 12 qid={type 0 version 0 path 820297} iounit 507904
v9fs_walk tag 0 id 110 fid 875 newfid 900 nwnames 0
v9fs_walk_return tag 0 id 110 nwnames 0 qids (nil)
v9fs_open tag 0 id 12 fid 900 mode 2
v9fs_open_return tag 0 id 12 qid={type 0 version 0 path 820297} iounit 507904
v9fs_lock tag 0 id 52 fid 568 type 1 start 0 length 0
v9fs_lock_return tag 0 id 52 status 0
v9fs_xattrwalk tag 0 id 30 fid 568 newfid 901 name security.capability
v9fs_rerror tag 0 id 30 err 95
v9fs_read tag 0 id 116 fid 568 off 192512 max_count 256

So guest opens /var/log/wtmp with fid=568 mode=32769, which is write-only
mode, and then it tries to read that fid 568, which eventually causes the
read() call on host to error with EBADF. Which makes sense, as the file was
opened in write-only mode, hence read() is not possible with that file
descriptor.

The other things I noticed when looking at the 9p command sequence above:
there is a Twalk on fid 568 before, which is not clunked before reusing fid
568 with Topen later. And before that Twalk on fid 568 there is a Tclunk on
fid 568, but apparently that fid was not used before.

> Perhaps backing filesystem dependant? qemu version? virtfs access options?

I tried with different hardware and different file systems (ext4, btrfs), same
misbehaviours.

QEMU is latest git version. I also tried several different QEMU versions, same
thing.

QEMU command line used:

~/git/qemu/build/qemu-system-x86_64 \
-machine pc,accel=kvm,usb=off,dump-guest-core=off -m 16384 \
-smp 8,sockets=8,cores=1,threads=1 -rtc base=utc -boot strict=on \
-kernel ~/vm/bullseye/boot/vmlinuz \
-initrd ~/vm/bullseye/boot/initrd.img \
-append 'root=fsRoot rw rootfstype=9p rootflags=trans=virtio,version=9p2000.L,msize=4186112,cache=loose console=ttyS0' \
-fsdev local,security_model=mapped,multidevs=remap,id=fsdev-fs0,path=$HOME/vm/bullseye/ \
-device virtio-9p-pci,id=fs0,fsdev=fsdev-fs0,mount_tag=fsRoot \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-nographic

Important for reproducing this issue:

* cache=loose
* -smp N (with N>1)
* Guest booted with Linux kernel containing commit eb497943fa21
(uname >= 5.16)

I'm pretty sure that you can reproduce this issue with the QEMU 9p rootfs
setup HOWTO linked before.

> It's all extremely slow though... like the final checkout counting files
> at less than 10/s

It is VERY slow. And the weird thing is that cache=loose got much slower than
cache=mmap. My worst case expactation would be cache=loose at least not
performing worse than cache=mmap.

Best regards,
Christian Schoenebeck