Re: [BUG REPORT] btrfs/io_uring: GPF in tctx_task_work_run after encoded read error completion

From: Jens Axboe

Date: Tue Jun 30 2026 - 15:02:58 EST

On 6/30/26 3:16 AM, Yue Sun wrote:
> Hello,
>
> I can reproduce a general protection fault on current upstream master by using
> IORING_OP_URING_CMD with BTRFS_IOC_ENCODED_READ on a loop-backed btrfs image
> while fail_make_request injects read errors.
>
> Summary
> -------
>
> The crash happens while io_uring is running task_work for a btrfs encoded read
> completion:
>
> tctx_task_work_run()
> mutex_lock(&ctx->uring_lock)
>
> The faulting mutex address is poisoned:
>
> RDI: dead000000001129
> KASAN: maybe wild-memory-access in range [0xdead000000001128-0xdead00000000112f]
>
> The root cause might be a double-completion/use-after-free race in the
> btrfs io_uring encoded read error path.
>
> The timing appears to be:
>
> # CPU0: userspace task issues IORING_OP_URING_CMD.
> io_uring_enter()
> btrfs_uring_cmd()
> btrfs_uring_encoded_read()
> ret = btrfs_encoded_read(...)
> if (ret == -EIOCBQUEUED)
> btrfs_uring_read_extent(..., cmd)
>
> btrfs_uring_read_extent()
> priv->cmd = cmd
> ret = btrfs_encoded_read_regular_fill_pages(..., priv)
>
> # In this helper, priv is struct btrfs_encoded_read_private.
> # uring_ctx points to the caller's struct btrfs_uring_priv.
> btrfs_encoded_read_regular_fill_pages(..., uring_ctx=priv)
> refcount_set(&priv->pending_refs, 1)
> priv->uring_ctx = uring_ctx
> refcount_inc(&priv->pending_refs)
> btrfs_submit_bbio(bbio, 0)
>
> # CPU1: the submitted bio fails quickly, before CPU0 drops its owner ref.
> btrfs_encoded_read_endio()
> WRITE_ONCE(priv->status, bbio->bio.bi_status)
> refcount_dec_and_test(&priv->pending_refs)
> # pending_refs goes 2 -> 1, so this context does not queue completion.
>
> # CPU0: btrfs_submit_bbio() has returned and the uring branch continues.
> btrfs_encoded_read_regular_fill_pages(..., uring_ctx=priv)
> if (refcount_dec_and_test(&priv->pending_refs)) {
> ret = blk_status_to_errno(READ_ONCE(priv->status))
> btrfs_uring_read_extent_endio(uring_ctx, ret)
> kfree(priv)
> return ret
> }
>
> # Here priv is the caller's struct btrfs_uring_priv.
> btrfs_uring_read_extent_endio(priv, err)
> bc->priv = priv
> io_uring_cmd_complete_in_task(priv->cmd, btrfs_uring_read_finished)
>
> # CPU0: task_work is queued, but the helper returns a normal error instead
> # of -EIOCBQUEUED, so the caller takes the synchronous failure path.
> btrfs_uring_read_extent()
> if (ret && ret != -EIOCBQUEUED)
> goto out_fail
> out_fail:
> btrfs_unlock_extent(...)
> btrfs_inode_unlock(...)
> kfree(priv)
> __free_page(...)
> kfree(pages)
> return ret
>
> # Later, the same task waits for io_uring completions and runs task_work.
> io_uring_enter()
> io_cqring_wait()
> io_run_task_work()
> task_work_run()
> tctx_task_work()
> tctx_task_work_run()
> req = container_of(node, struct io_kiocb, io_task_work.node)
> ctx = req->ctx
> mutex_lock(&ctx->uring_lock)
> # Crash: req->ctx appears poisoned/stale before
> # btrfs_uring_read_finished() is reached.

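If the completion is handed off to task_work, then btrfs must return
-EIOCBQUEUED, otherwise the caller also takes the synchronous failure path
and frees state that the queued task_work still references. Looks like a
basic bug in btrfs, see below. Caveat: entirely untested, not even
compiled. On vacation, btrfs guys can figure this out.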

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 272598f6ae77..51c06618c733 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9460,7 +9460,6 @@ int btrfs_encoded_read_regular_fill_pages(struct btrfs_inode *inode,
 			ret = blk_status_to_errno(READ_ONCE(priv->status));
 			btrfs_uring_read_extent_endio(uring_ctx, ret);
 			kfree(priv);
-			return ret;
 		}
 
 		return -EIOCBQUEUED;
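
To illustrate the ownership rule the diff above relies on, here is a small
userspace model in C. It is only a sketch under stated assumptions, not
kernel code: every model_* name is hypothetical, the bio is completed
inline to stand in for the "fails quickly" interleaving from the report,
and MODEL_EIOCBQUEUED just mirrors the kernel-internal errno value. The
always_queued flag selects between the behaviour before the diff (false)
and after it (true).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumption: mirrors the kernel-internal EIOCBQUEUED, which userspace
 * errno.h does not define. */
#define MODEL_EIOCBQUEUED 529

/* Hypothetical stand-in for the per-request state in the report. */
struct model_req {
	int refs;               /* models pending_refs */
	int status;             /* first I/O error seen, 0 if none */
	bool completion_queued; /* deferred (task_work) completion was queued */
	bool freed_by_caller;   /* caller's synchronous error path freed state */
};

/* Models the bio end_io: record the error and drop the bio's reference. */
void model_endio(struct model_req *r, int err)
{
	if (err)
		r->status = err;
	if (--r->refs == 0)
		r->completion_queued = true;
}

/* Models the uring branch of the fill-pages helper. */
int model_fill_pages(struct model_req *r, int io_err, bool always_queued)
{
	r->refs = 1;             /* owner reference */
	r->refs++;               /* one in-flight bio */
	model_endio(r, io_err);  /* bio fails before the owner ref is dropped */

	if (--r->refs == 0) {
		/* Owner dropped the last ref, so it queues the completion. */
		r->completion_queued = true;
		if (!always_queued)
			return r->status; /* behaviour before the diff */
	}
	assert(r->refs >= 0);
	return -MODEL_EIOCBQUEUED;
}

/* Models the caller: any error other than "queued" is cleaned up inline. */
int model_read_extent(struct model_req *r, int io_err, bool always_queued)
{
	int ret = model_fill_pages(r, io_err, always_queued);

	if (ret && ret != -MODEL_EIOCBQUEUED) {
		r->freed_by_caller = true; /* models the out_fail path */
		return ret;
	}
	return -MODEL_EIOCBQUEUED;
}

/* True when the request is both queued for completion and already freed. */
bool model_double_completion(const struct model_req *r)
{
	return r->completion_queued && r->freed_by_caller;
}
```

Calling model_read_extent() with an injected -EIO and always_queued set to
false leaves the request both queued and freed, which is the shape of the
reported crash. With always_queued set to true the caller sees only the
"queued" return value and leaves cleanup to the deferred completion.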

--
Jens Axboe