loop subsystem corrupted after mounting multiple btrfs sub-volumes

From: Stanislav Brabec
Date: Thu Feb 25 2016 - 14:23:03 EST


While writing a test suite for util-linux[1], I experienced strange
behavior of the loop device:

When two loop devices refer to the same file and a btrfs mount is
called on each of them, the second mount changes the loop device
reported for the first, already mounted sub-volume. (Note that the
current implementation of util-linux mount -oloop works exactly this
way: it allocates a new loop device for each mount command, so this bug
can be easily reproduced without losetup, just by using "mount -oloop"
or fstab.)
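
For illustration, the losetup-free variant mentioned above might look
roughly like the following. This is only a sketch, not a script that was
run for this report: the function name repro_mount_oloop and its three
arguments are made up here, it must be run as root, and it assumes an
image like /btrfs.img prepared by the image creator script.

```shell
# Hypothetical helper (not part of the original report).
# Usage: repro_mount_oloop IMAGE DIR1 DIR2   (must be run as root)
repro_mount_oloop() {
    img=$1 dir1=$2 dir2=$3
    mkdir -p "$dir1" "$dir2"
    # Each "mount -o loop" call allocates its own loop device for the file.
    mount -o loop "$img" "$dir1" || return 1
    grep " $dir1 " /proc/self/mountinfo
    mount -o loop,subvol=/ "$img" "$dir2" || { umount "$dir1"; return 1; }
    # On an affected kernel, the line for DIR1 now names the second device.
    grep " $dir1 " /proc/self/mountinfo
    umount "$dir2"
    umount "$dir1"
}
```

It would be called as: repro_mount_oloop /btrfs.img /mnt/1 /mnt/2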

/proc/self/mountinfo after first btrfs loop mount:

107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop0 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2

After the second btrfs loop mount, this line changes to:

107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop1 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2

See the change of /dev/loop0 to /dev/loop1!
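
To make the changed field easier to spot, the mount source can be pulled
out of each mountinfo line. In that format the optional fields start at
field 7 and end at a lone "-" separator, which is followed by the file
system type, the mount source and the super block options. The helper
below is only a sketch; the name mount_sources is made up here.

```shell
# Hypothetical helper: print "<mount point> <mount source>" for every
# mountinfo line read from stdin.
mount_sources() {
    awk '{
        # Optional fields start at field 7; the first lone "-" ends them.
        for (i = 7; i <= NF; i++)
            if ($i == "-") { print $5, $(i + 2); break }
    }'
}
```

It can be fed from the proc file directly:
mount_sources </proc/self/mountinfo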

This is apparently not just a change in the proc file; it also
corrupts the loop device subsystem, as I later observed severe problems
on the affected system:

- mount(2) returning 0 but doing nothing.

- mount(8) entering an infinite loop while searching for a free loop
device.
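
The first symptom can be checked for from a script by confirming that
the target really shows up in mountinfo after mount reports success.
This is a minimal sketch only: the name is_mounted is made up here, the
exit status of mount(8) stands in for the return value of mount(2), and
the mount point is assumed to contain no spaces or backslashes (those
are escaped in mountinfo).

```shell
# Hypothetical helper: succeed if DIR appears as a mount point.
# Usage: is_mounted DIR [MOUNTINFO_FILE]
is_mounted() {
    awk -v dir="$1" '$5 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/self/mountinfo}"
}
```

Typical use would be:
mount /dev/loop0 /mnt/1 && ! is_mounted /mnt/1 && echo "mount did nothing"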


Here is the main reproducer:

=====================
#!/bin/sh
# Prepare the environment:
/btrfs.sh
mkdir -p /mnt/1 /mnt/2
losetup /dev/loop0 /btrfs.img
# Verify that nothing is mounted:
cat /proc/self/mountinfo | grep /mnt
mount /dev/loop0 /mnt/1
echo "One file system should be mounted now."
cat /proc/self/mountinfo | grep /mnt
# Create another loop.
losetup /dev/loop1 /btrfs.img
echo "Going to mount second one."
mount -osubvol=/ /dev/loop1 /mnt/2 2>&1
echo "Two file system should be mounted now."
cat /proc/self/mountinfo | grep /mnt
echo "Strange. First mount changed its loop device!"
umount /mnt/2
echo "And now check, whether it remains changed after umount."
cat /proc/self/mountinfo | grep /mnt
umount /mnt/1
losetup -d /dev/loop1
losetup -d /dev/loop0
rmdir /mnt/1 /mnt/2
=====================

And here is its output:

One file system should be mounted now.
107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop0 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
Going to mount second one.
Two file system should be mounted now.
107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop1 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
108 59 0:59 / /mnt/2 rw,relatime shared:47 - btrfs /dev/loop1 rw,space_cache,subvolid=5,subvol=/
Strange. First mount changed its loop device!
And now check, whether it remains changed after umount.
107 59 0:59 /d0/dd0/ddd0/s1/d1/dd1/ddd1/s2 /mnt/1 rw,relatime shared:45 - btrfs /dev/loop1 rw,space_cache,subvolid=257,subvol=/d0/dd0/ddd0/s1/d1/dd1/ddd1/s2
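
For a test suite, the device switch visible in this output could be
detected mechanically by saving mountinfo before and after the second
mount and comparing the mount source per mount ID. The following is a
sketch under that assumption; the function name changed_sources and the
two snapshot files are made up here.

```shell
# Hypothetical helper: report mounts whose source differs between two
# saved copies of /proc/self/mountinfo.
# Usage: changed_sources BEFORE_FILE AFTER_FILE
changed_sources() {
    awk '
        {
            # The mount source is the second field after the lone "-".
            src = ""
            for (i = 7; i <= NF; i++)
                if ($i == "-") { src = $(i + 2); break }
        }
        # First file: remember the source for each mount ID.
        FILENAME == ARGV[1] { before[$1] = src; next }
        # Second file: print mounts that existed before with another source.
        ($1 in before) && before[$1] != src {
            print $5 ": " before[$1] " -> " src
        }
    ' "$1" "$2"
}
```

With the snapshots from the run above, this should print a single line
for /mnt/1 going from /dev/loop0 to /dev/loop1.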

This was reproduced on Linux 4.4.1 on openSUSE Tumbleweed.


Test image creator:

===== /btrfs.sh =====
#!/bin/bash
# Needs bash: uses pushd/popd and brace expansion (file{1..5}).
truncate -s 42M /btrfs.img
mkfs.btrfs -f -d single -m single /btrfs.img >/dev/null
mount -o loop /btrfs.img /mnt
pushd . >/dev/null
cd /mnt
mkdir -p d0/dd0/ddd0
cd ./d0/dd0/ddd0
touch file{1..5}
btrfs subvol create s1 >/dev/null
cd ./s1
touch file{1..5}
mkdir bind-point
mkdir -p d1/dd1/ddd1
cd ./d1/dd1/ddd1
btrfs subvol create s2 >/dev/null
DEFAULT_SUBVOLID=$(btrfs inspect rootid s2)
btrfs subvol set-default $DEFAULT_SUBVOLID . >/dev/null
NON_DEFAULT_SUBVOLID=$(btrfs subvol list /mnt |
while read dummy id rest ; do if test $id = $DEFAULT_SUBVOLID ; then
continue ; fi ; echo $id ; done)
cd ../../../..
mkdir -p d2/dd2/ddd2
cd ./d2/dd2/ddd2
btrfs subvol create s3 >/dev/null
mkdir -p s3/bind-mnt
popd >/dev/null
NON_DEFAULT_SUBVOL=d0/dd0/ddd0/d2/dd2/ddd2/s3
umount /mnt
=====================

[1] http://marc.info/?l=util-linux-ng&m=145590643206663&w=2

--
Best Regards / S pozdravem,

Stanislav Brabec
software developer
---------------------------------------------------------------------
SUSE LINUX, s. r. o. e-mail: sbrabec@xxxxxxxx
Lihovarská 1060/12 tel: +49 911 7405384547
190 00 Praha 9 fax: +420 284 084 001
Czech Republic http://www.suse.cz/
PGP: 830B 40D5 9E05 35D8 5E27 6FA3 717C 209F A04F CD76