Problem with mounting NFS shares after sudden power outage, fstab mount procedure jumbles NFS mounts

From: Thomas Korimort
Date: Tue Aug 24 2021 - 02:29:56 EST

Hi!

This is already the second time I have run into this, and it strikes me
as a strange issue. I have an AMD Ryzen desktop PC running Debian
Bullseye (previously Buster) and a JBOD disk tower with all four slots
occupied. On my desktop I mount the four disks, each with an ext4 file
system, as NFS shares via /etc/fstab. This worked nicely until the most
recent sudden power outage. After the outage the drives had to be
checked and their inodes repaired. The file system of one disk that was
in use was destroyed; I restored it by copying the backup onto the
repaired disk. After the copy and a reboot, I also changed the file
permissions and ownership. Since then, the mount procedure that
/etc/fstab triggers during system startup no longer mounts my NFS
shares correctly. The kernel mount procedure waits for two of the
drives to mount and then somehow jumbles the NFS mounts.

My /etc/fstab has these four entries for the NFS shares:

10.10.10.2:/mnt/WD01     /mnt/WD01    nfs    rw,auto,nofail    0    0
10.10.10.2:/mnt/WD02    /mnt/WD02    nfs    rw,auto,nofail    0    0
10.10.10.2:/mnt/WD03    /mnt/WD03    nfs    rw,auto,nofail    0    0
10.10.10.2:/mnt/WD04    /mnt/WD04    nfs    rw,auto,nofail    0    0

and the actual mounts are listed in /proc/mounts like this:

10.10.10.2:/mnt/WD04 /mnt/WD04 nfs4 rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.1,local_lock=none,addr=10.10.10.2 0 0
10.10.10.2:/mnt/WD03 /mnt/WD03 nfs4 rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.1,local_lock=none,addr=10.10.10.2 0 0
10.10.10.2:/mnt/WD02 /mnt/WD02 nfs4 rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.1,local_lock=none,addr=10.10.10.2 0 0
10.10.10.2:/mnt/WD03 /mnt/WD01 nfs4 rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.10.10.1,local_lock=none,addr=10.10.10.2 0 0

One can see that 10.10.10.2:/mnt/WD03 gets mounted on /mnt/WD01 and that
there is no mount entry for 10.10.10.2:/mnt/WD01 at all. The content of
the two mounts seems to be that of /mnt/WD01 (WD01 is a hand-maintained
mirror of /mnt/WD02). What is wrong here? Rebooting does not change
anything. This strange constellation is carried over from reboot to
reboot, and I don't know where to find the run state or similar file
that got jumbled and keeps transferring the wrong mount information
across reboots. I looked in /var/run, /tmp and so on for files related
to the kernel mount procedure and fstab.
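
The comparison between /etc/fstab and /proc/mounts described above can
also be done mechanically. What follows is only a minimal sketch, not a
diagnosis: the function names (parse_nfs_entries, find_mismatches,
check_live_system) are made up for illustration, and the sample data is
the listing from this mail with the option strings shortened. It assumes
whitespace-separated fields in both files and treats every file system
type starting with "nfs" as an NFS mount.

```python
def parse_nfs_entries(text):
    """Return {mount_point: source} for NFS lines in fstab-style text."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a complete "source mountpoint fstype ..." line
        source, mount_point, fstype = fields[0], fields[1], fields[2]
        if fstype.startswith("nfs"):  # matches both "nfs" and "nfs4"
            # If a mount point appears twice, the later line wins.
            table[mount_point] = source
    return table


def find_mismatches(fstab_text, mounts_text):
    """List mount points whose actual source differs from the fstab entry."""
    expected = parse_nfs_entries(fstab_text)
    actual = parse_nfs_entries(mounts_text)
    problems = []
    for mount_point, source in sorted(expected.items()):
        mounted = actual.get(mount_point)
        if mounted is None:
            problems.append(f"{mount_point}: not mounted (expected {source})")
        elif mounted != source:
            problems.append(
                f"{mount_point}: mounted from {mounted}, expected {source}"
            )
    return problems


def check_live_system():
    """Run the comparison against the real files on a Linux machine."""
    with open("/etc/fstab") as fstab, open("/proc/mounts") as mounts:
        return find_mismatches(fstab.read(), mounts.read())


# Sample data taken from this mail; mount options shortened for brevity.
SAMPLE_FSTAB = """\
10.10.10.2:/mnt/WD01 /mnt/WD01 nfs rw,auto,nofail 0 0
10.10.10.2:/mnt/WD02 /mnt/WD02 nfs rw,auto,nofail 0 0
10.10.10.2:/mnt/WD03 /mnt/WD03 nfs rw,auto,nofail 0 0
10.10.10.2:/mnt/WD04 /mnt/WD04 nfs rw,auto,nofail 0 0
"""

SAMPLE_MOUNTS = """\
10.10.10.2:/mnt/WD04 /mnt/WD04 nfs4 rw,relatime,vers=4.2 0 0
10.10.10.2:/mnt/WD03 /mnt/WD03 nfs4 rw,relatime,vers=4.2 0 0
10.10.10.2:/mnt/WD02 /mnt/WD02 nfs4 rw,relatime,vers=4.2 0 0
10.10.10.2:/mnt/WD03 /mnt/WD01 nfs4 rw,relatime,vers=4.2 0 0
"""

for problem in find_mismatches(SAMPLE_FSTAB, SAMPLE_MOUNTS):
    print(problem)
```

On the sample data this flags only /mnt/WD01, which is mounted from
10.10.10.2:/mnt/WD03 instead of 10.10.10.2:/mnt/WD01. On a real machine,
check_live_system() would run the same comparison against the actual
files.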

The mount procedure waits 1.5 minutes for WD01 and WD03 to mount and
then jumbles the mounts. The drives themselves are fine on my NFS
server 10.10.10.2, a Raspberry Pi 4 (vanilla Debian Bullseye arm64,
built with the image-specs script a few days ago), and the exportfs
listing also looks ok (the 10.10.10.1 exports are rw, the others ro;
the other export options are sync and no_root_squash for all exports):

/mnt/WD01         10.10.10.1
/mnt/WD02         10.10.10.1
/mnt/WD03         10.10.10.1
/mnt/WD04         10.10.10.1
/mnt/WD01         192.168.0.0/24
/mnt/WD01         192.168.1.0/24
/mnt/WD01         10.10.10.0/24

WD01 is exported into multiple IP address ranges to suit my different
router and network switch configurations. It was working nicely until
yesterday, before the sudden power outage and disk recovery.
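
Since WD01 is exported to several address ranges, it may be worth
checking which export entries apply to a given client. The sketch below
does this for the export list shown above using Python's standard
ipaddress module. The function name matching_exports and the EXPORTS
list are illustrative only, and the code assumes that every client
specification is a plain IP address or a CIDR network; hostnames and
wildcards, which exports also allows, are skipped. It only reports which
entries match and makes no claim about whether an overlap is related to
the problem.

```python
import ipaddress

# Export list as printed by exportfs in this mail: (path, client spec).
EXPORTS = [
    ("/mnt/WD01", "10.10.10.1"),
    ("/mnt/WD02", "10.10.10.1"),
    ("/mnt/WD03", "10.10.10.1"),
    ("/mnt/WD04", "10.10.10.1"),
    ("/mnt/WD01", "192.168.0.0/24"),
    ("/mnt/WD01", "192.168.1.0/24"),
    ("/mnt/WD01", "10.10.10.0/24"),
]


def matching_exports(exports, client_ip):
    """Return {path: [client specs]} for entries that cover client_ip."""
    client = ipaddress.ip_address(client_ip)
    hits = {}
    for path, spec in exports:
        try:
            # A bare address such as "10.10.10.1" becomes a /32 network.
            network = ipaddress.ip_network(spec, strict=False)
        except ValueError:
            continue  # hostname or wildcard spec: not handled in this sketch
        if client in network:
            hits.setdefault(path, []).append(spec)
    return hits


for path, specs in sorted(matching_exports(EXPORTS, "10.10.10.1").items()):
    print(path, specs)
```

For the client 10.10.10.1, /mnt/WD01 is covered by two entries (the
single host 10.10.10.1 and the network 10.10.10.0/24), while WD02 to
WD04 are each covered by one.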

Exactly the same problem happened with my old Debian Buster desktop
installation, when I had to replace a hard disk and after a power
outage. It seems to be a recurring problem over the last decade, much
like the famous pulseaudio sequencer problem.

Greetings, Thomas Korimort.
