Re: Possible kernel fs block code regression in 6.2.3 umounting usb drives

From: Jens Axboe
Date: Fri Mar 10 2023 - 15:23:59 EST


On 3/10/23 1:16 PM, Eric Biggers wrote:
> On Fri, Mar 10, 2023 at 12:14:10PM -0800, Eric Biggers wrote:
>> On Fri, Mar 10, 2023 at 07:33:37PM +0000, Mike Cloaked wrote:
>>> With kernel 6.2.3, if I simply plug in a usb external drive, mount it
>>> and umount it, then the journal has a kernel Oops and I have submitted
>>> a bug report, that includes the journal output, at
>>> https://bugzilla.kernel.org/show_bug.cgi?id=217174
>>>
>>> As soon as the usb drive is unmounted, the kernel Oops occurs, and the
>>> machine hangs on shutdown and needs a hard reboot.
>>>
>>> I have reproduced the same issue on three different machines, and in
>>> each case downgrading back to kernel 6.2.2 resolves the issue and it
>>> no longer occurs.
>>>
>>> This would seem to be a regression in kernel 6.2.3.
>>>
>>> Mike C
>>
>> Thanks for reporting this! If this is reliably reproducible and is known to be
>> a regression between v6.2.2 and v6.2.3, any chance you could bisect it to find
>> out the exact commit that introduced the bug?
>>
>> For reference I'm also copying the stack trace from bugzilla below:
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000028
>> #PF: supervisor read access in kernel mode
>> #PF: error_code(0x0000) - not-present page
>> PGD 0 P4D 0
>> Oops: 0000 [#1] PREEMPT SMP PTI
>> CPU: 9 PID: 1118 Comm: lvcreate Tainted: G T 6.2.3-1>
>> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z370 Ex>
>> RIP: 0010:blk_throtl_update_limit_valid+0x1f/0x110
>
> BTW, the block/ commits between v6.2.2 and v6.2.3 were:
>
> blk-mq: avoid sleep in blk_mq_alloc_request_hctx
> blk-mq: remove stale comment for blk_mq_sched_mark_restart_hctx
> blk-mq: wait on correct sbitmap_queue in blk_mq_mark_tag_wait
> blk-mq: Fix potential io hung for shared sbitmap per tagset
> blk-mq: correct stale comment of .get_budget
> block: sync mixed merged request's failfast with 1st bio's
> block: Fix io statistics for cgroup in throttle path
> block: bio-integrity: Copy flags when bio_integrity_payload is cloned
> block: use proper return value from bio_failfast()
> blk-iocost: fix divide by 0 error in calc_lcoefs()
> blk-cgroup: dropping parent refcount after pd_free_fn() is done
> blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()
> block: don't allow multiple bios for IOCB_NOWAIT issue
> block: clear bio->bi_bdev when putting a bio back in the cache
> block: be a bit more careful in checking for NULL bdev while polling
>
> Without having any in-depth knowledge here, I think "blk-cgroup: synchronize
> pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()" looks the
> most suspicious here... I see that AUTOSEL selected it from a 3-patch series
> without backporting patch 2, maybe that could be it? Anyway, just a hunch.

I was just looking at this too; the primary suspects would indeed be those
two blk-cgroup changes. And yes, they ended up in stable due to auto
selection, and it's very odd that it picked two of the patches and not the third?!

But I would revert:

bfe46d2efe46c5c952f982e2ca94fe2ec5e58e2a
57a425badc05c2e87e9f25713e5c3c0298e4202c

in that order from 6.2.3 and see if that helps. Adding Yu.
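
For anyone testing this, here is a minimal sketch of that revert sequence.
It assumes a git checkout of a stable tree at v6.2.3 in which both hashes
above resolve; the helper names (revert_commands, run_reverts) are made up
for illustration, and by default the script only prints the commands.

```python
import subprocess

# Hashes copied from the message above, in the order given there.
# Assumption: both resolve in the tree being tested (a v6.2.3 checkout).
COMMITS = [
    "bfe46d2efe46c5c952f982e2ca94fe2ec5e58e2a",
    "57a425badc05c2e87e9f25713e5c3c0298e4202c",
]


def revert_commands(commits):
    """Build one `git revert --no-edit <hash>` argv per commit, keeping order."""
    return [["git", "revert", "--no-edit", commit] for commit in commits]


def run_reverts(commits, repo=".", dry_run=True):
    """Print each command; run it inside `repo` only when dry_run is False."""
    for argv in revert_commands(commits):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first revert that fails (e.g. a conflict).
            subprocess.run(argv, cwd=repo, check=True)


if __name__ == "__main__":
    run_reverts(COMMITS)
```

Calling run_reverts(COMMITS, repo="/path/to/linux", dry_run=False) would
actually create the two revert commits; the path is a placeholder. After
that, rebuild, boot the result and repeat the mount/umount test.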

--
Jens Axboe