Re: [PATCH] init/do_mounts.c: Add root="fstag:<tag>" syntax for root device

From: Dominique Martinet
Date: Sun Jun 13 2021 - 07:57:25 EST


Dominique Martinet wrote on Thu, Jun 10, 2021 at 05:33:34PM +0900:
> Stefan Hajnoczi wrote on Thu, Jun 10, 2021 at 09:16:38AM +0100:
> > virtio-9p should be simple. I'm not sure how much additional setup the
> > other 9p transports require. TCP and RDMA seem doable if there are
> > kernel parameters to configure things before the root file system is
> > mounted.
>
> For TCP, we can probably piggyback on what nfs does for this, see the
> ip= parameter in Documentation/admin-guide/nfs/nfsroot.rst -- it lives
> in net/ipv4/ipconfig.c so should just work out of the box

Hm, just tried and it doesn't quite work for some reason -- in this
stack trace:
kthread_should_stop+0x71/0xb0
wait_woken+0x182/0x1c0
__inet_stream_connect+0x48a/0xc00
inet_stream_connect+0x53/0xa0
p9_fd_create_tcp+0x2d6/0x420
p9_client_create+0x7bc/0x11d0
v9fs_session_init+0x1cd/0x1220
v9fs_mount+0x8c/0x870
legacy_get_tree+0xef/0x1d0
vfs_get_tree+0x83/0x240
path_mount+0xda3/0x1800
init_mount+0x98/0xdd
do_mount_root+0xe0/0x255
mount_root+0x47/0xd7
prepare_namespace+0x136/0x165
kernel_init+0xd/0x123
ret_from_fork+0x22/0x30

current->set_child_tid is null, causing a null deref when checking
&to_kthread(current)->flags

It does work with nfsroot though so even if this doesn't look 9p
specific I guess I'll need to debug that eventually, but this can
be done later... I'm guessing they don't use the same connect() function
as 9p's is ipv4-specific (ugh) and that needs fixing eventually anyway.

For reference this is relevant part of kernel command line I used for
tcp:
root=fstag:x.y.z.t rootflags=trans=tcp,aname=rootfs rootfstype=9p ip=dhcp

(and ip=dhcp requires CONFIG_IP_PNP_DHCP=y)



Virtio does work quite well though and that's good enough for me -- I
was going to suggest also documenting increasing the msize (setting
e.g. rootflags=msize=262144) but we really ought to increase the
default, that came up recently and since no patch was sent I kind of
forgot... Will do that now.



@Vivek - I personally don't really care much, but would tend to prefer
your v2 (without fstag:) from a user perspective the later is definitely
better but I don't really like the static nobdev_filesystems array --
I'd bite the bullet and use FS_REQUIRES_DEV (and move this part of the
code just a bit below after the root_wait check just in case it matters,
but at that point if something would mount with /dev/root but not with
the raw root=argument then they probably do require a device!)

It could also be gated by a config option like e.g. CONFIG_ROOT_NFS and
others all are to make sure it doesn't impact anyone who doesn't want to
be impacted - I'm sure some people want to make sure their device
doesn't boot off a weird root if someone manages to change kernel params
so would want a way of disabling the option...


Well, if you keep the array, please add 9p to the list and resend as a
proper patch so I can reply with tested-by/reviewed-by tags on something
more final.


Also, matter-of-factedly, how is this going to be picked up?
Is the plan to send it directly to Linus as part of the next virtiofs
PR? Going through Al Viro?


Thanks,
--
Dominique