From: Jianzhou Zhao
Date: Wed Mar 11 2026 - 04:11:13 EST
Subject: [BUG] blk-core: KCSAN: data-race in __bio_queue_enter / __blk_mq_unfreeze_queue
Dear Maintainers,
We are writing to report a KCSAN-detected data race in the block layer's queue freeze/unfreeze path. The bug was found by our custom fuzzing tool, RacePilot. The race occurs when `__blk_mq_unfreeze_queue()` decrements `q->mq_freeze_depth` under the `mq_freeze_lock` mutex while `__bio_queue_enter()` concurrently performs a lockless read of `q->mq_freeze_depth` when evaluating its `wait_event()` condition. We observed this on Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.
Call Trace & Context
==================================================================
BUG: KCSAN: data-race in __bio_queue_enter / __blk_mq_unfreeze_queue
write to 0xffff88800b03e534 of 4 bytes by task 13701 on cpu 1:
__blk_mq_unfreeze_queue+0x6f/0x140 block/blk-mq.c:226
blk_mq_unfreeze_queue_nomemrestore+0x17/0x20 block/blk-mq.c:242
blk_mq_unfreeze_queue include/linux/blk-mq.h:960 [inline]
loop_set_status+0x371/0x580 drivers/block/loop.c:1268
...
__x64_sys_ioctl+0x121/0x170 fs/ioctl.c:591
read to 0xffff88800b03e534 of 4 bytes by task 4781 on cpu 0:
__bio_queue_enter+0x32a/0x620 block/blk-core.c:353
bio_queue_enter block/blk.h:93 [inline]
blk_mq_submit_bio+0xb57/0x1360 block/blk-mq.c:3172
__submit_bio+0x16f/0x580 block/blk-core.c:637
...
submit_bio+0x1b0/0x260 block/blk-core.c:921
...
__x64_sys_read+0x41/0x50 fs/read_write.c:737
value changed: 0x00000001 -> 0x00000000
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 4781 Comm: systemd-udevd Not tainted 6.18.0-08691-g2061f18ad76e-dirty #50 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================
Execution Flow & Code Context
When a device's configuration is updated (e.g., in `loop_set_status()`), the subsystem calls `blk_mq_unfreeze_queue()`, which ends the freeze by acquiring `q->mq_freeze_lock` and decrementing `q->mq_freeze_depth`:
```c
// block/blk-mq.c
bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic)
{
	...
	mutex_lock(&q->mq_freeze_lock);
	...
	q->mq_freeze_depth--;	// <-- concurrent 4-byte write (to 0, under mq_freeze_lock)
	WARN_ON_ONCE(q->mq_freeze_depth < 0);
	if (!q->mq_freeze_depth) {
		percpu_ref_resurrect(&q->q_usage_counter);
		wake_up_all(&q->mq_freeze_wq);
	}
	...
}
```
Meanwhile, incoming bio submissions block in `__bio_queue_enter()` while the queue is frozen. This path calls `wait_event()` on `q->mq_freeze_wq` with a condition that includes `!q->mq_freeze_depth`:
```c
// block/blk-core.c
int __bio_queue_enter(struct request_queue *q, struct bio *bio)
{
	while (!blk_try_enter_queue(q, false)) {
		struct gendisk *disk = bio->bi_bdev->bd_disk;
		...
		smp_rmb();
		wait_event(q->mq_freeze_wq,
			   (!q->mq_freeze_depth &&	// <-- concurrent lockless 4-byte read
			    blk_pm_resume_queue(false, q)) ||
			   test_bit(GD_DEAD, &disk->state));
		...
	}
	...
}
```
Root Cause Analysis
KCSAN flags a data race because `__bio_queue_enter()` reads `q->mq_freeze_depth` locklessly each time the `wait_event()` condition is evaluated, while `__blk_mq_unfreeze_queue()` decrements the same field protected only by `mq_freeze_lock`, which the reader does not take. Since both accesses are plain (unannotated), the compiler is in principle free to tear, refetch, or otherwise transform them, and KCSAN reports the pair as a race. The surrounding logic appears to tolerate this: the wake-up on `q->mq_freeze_wq` and the `smp_rmb()` barrier ensure the reader eventually observes the updated value, so a stale read only causes a benign re-check of the condition.
Unfortunately, we were unable to generate a reproducer for this bug.
Potential Impact
The block layer tolerates this race by design: the `wait_event()` loop simply re-evaluates its condition, so practical consequences are highly unlikely. The main impact is diagnostic, as the unannotated accesses generate recurring KCSAN reports on a hot I/O submission path.
Proposed Fix
To make the intentional lockless access explicit and silence the KCSAN report, `WRITE_ONCE()` and `READ_ONCE()` can be used to annotate `mq_freeze_depth` where it is written under the lock and read locklessly. A concrete mitigation is to mark both the counter decrement and the wait condition:
```diff
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -223,6 +223,7 @@ bool __blk_mq_unfreeze_queue(struct request_queue *q, bool force_atomic)
 		q->q_usage_counter.data->force_atomic = true;
 	}
-	q->mq_freeze_depth--;
+	WRITE_ONCE(q->mq_freeze_depth, q->mq_freeze_depth - 1);
 	WARN_ON_ONCE(q->mq_freeze_depth < 0);
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -350,7 +350,7 @@ int __bio_queue_enter(struct request_queue *q, struct bio *bio)
 		smp_rmb();
 		wait_event(q->mq_freeze_wq,
-			   (!q->mq_freeze_depth &&
+			   (!READ_ONCE(q->mq_freeze_depth) &&
 			    blk_pm_resume_queue(false, q)) ||
 			   test_bit(GD_DEAD, &disk->state));
```
We hope this report is helpful.
Best regards,
RacePilot Team