Re: bcachefs page fault oops after device evacuate/remove and unmount

From: Kent Overstreet
Date: Fri Dec 01 2023 - 00:27:18 EST


On Thu, Nov 30, 2023 at 10:25:48PM -0500, Kent Overstreet wrote:
> On Thu, Nov 30, 2023 at 08:47:33PM -0500, Daniel J Blueman wrote:
> > Hi Kent et al,
> >
> > On upstream bcachefs (master @ evilpiepirate.org/git/bcachefs.git) SHA
> > f8a1ba26, I was able to develop a minimal reproducer [1] for a page
> > not present oops I can provoke [2]. It appears we need further
> > synchronisation during unmount.
> >
> > Let me know when there is a patch I can test, or for debug.
> >
> > Thanks,
> > Dan
> >
> > -- [1]
> >
> > modprobe brd rd_size=536870912 rd_nr=2
> > bcachefs format -f /dev/ram0 /dev/ram1
> > mount -t bcachefs /dev/ram0:/dev/ram1 /mnt
> > fio --group_reporting --ioengine=io_uring --directory=/mnt --size=16m
> > --time_based --runtime=60s --iodepth=256 --verify_async=8 --bs=4k-64k
> > --norandommap --random_distribution=zipf:0.5 --numjobs=16 --rw=randrw
> > --name=A --direct=1 --name=B --direct=0 >/dev/null &
> > bcachefs device evacuate /dev/ram0
> > bcachefs device remove --force --force-metadata /dev/ram1
> > bcachefs device remove --force --force-metadata /dev/ram1
> > kill %1
> > wait
> > umount /mnt
>
> The remove fails for me with DEVICE_SET_STATE_NOT_ALLOWED - evacuate set
> ram0 to ro, we can't remove our last rw dev.

Got it to repro, just had to ignore those errors - there's now a ktest
test for it, and the fix is in master :)
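
For anyone following along, here is a rough sketch of the reproducer from
[1] with the failing steps ignored, so the script carries on to the unmount.
This is not the ktest test itself. ignore_err and repro are made-up helper
names, and using $! in place of %1 is an assumption so that it also works in
a non-interactive shell.

```shell
# ignore_err: run a command; report a non-zero exit on stderr but keep going.
# (Hypothetical helper for illustration, not part of bcachefs-tools or ktest.)
ignore_err() {
    "$@" || echo "ignored failure ($?): $*" >&2
    return 0
}

# repro: the command sequence from the report, with the evacuate/remove
# steps wrapped so that errors such as DEVICE_SET_STATE_NOT_ALLOWED do not
# stop the run. Needs root, the brd module, and bcachefs-tools.
repro() {
    modprobe brd rd_size=536870912 rd_nr=2
    bcachefs format -f /dev/ram0 /dev/ram1
    mount -t bcachefs /dev/ram0:/dev/ram1 /mnt

    # Background I/O load, same options as in the report.
    fio --group_reporting --ioengine=io_uring --directory=/mnt --size=16m \
        --time_based --runtime=60s --iodepth=256 --verify_async=8 --bs=4k-64k \
        --norandommap --random_distribution=zipf:0.5 --numjobs=16 --rw=randrw \
        --name=A --direct=1 --name=B --direct=0 >/dev/null &
    fio_pid=$!

    ignore_err bcachefs device evacuate /dev/ram0
    ignore_err bcachefs device remove --force --force-metadata /dev/ram1
    ignore_err bcachefs device remove --force --force-metadata /dev/ram1

    ignore_err kill "$fio_pid"
    wait
    umount /mnt
}
```

Source the file and call repro as root on a scratch machine. Nothing runs
until repro is invoked.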