Re: [PATCH 4/8] io_uring/io-wq: cache work->flags in variable

From: Pavel Begunkov
Date: Wed Jan 29 2025 - 13:56:54 EST

In reply to: Max Kellermann: "[PATCH 4/8] io_uring/io-wq: cache work->flags in variable"
Next in thread: Max Kellermann: "Re: [PATCH 4/8] io_uring/io-wq: cache work->flags in variable"

On 1/28/25 13:39, Max Kellermann wrote:

> This eliminates several redundant atomic reads and therefore reduces
> the duration the surrounding spinlocks are held.

What architecture are you running? I don't get why the reads
are expensive when they're relaxed and there shouldn't even be
any contention. They don't even need to be atomics; we should
still be able to convert them back to plain ints.
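
For illustration, a minimal userspace sketch of the two access patterns
being discussed. It uses C11 atomics rather than the kernel's
atomic_read(), and struct demo_work and the DEMO_WORK_* flags are
hypothetical names made up for this example, not io-wq code. On x86 a
relaxed load is typically an ordinary load instruction, so the main
visible difference is that the compiler will generally not merge the
repeated loads in the first variant.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits, named for illustration only. */
enum {
	DEMO_WORK_HASHED = 1u << 0,
	DEMO_WORK_CANCEL = 1u << 1,
};

/* Hypothetical stand-in for a work item that carries atomic flags. */
struct demo_work {
	atomic_uint flags;
};

/* Reloads the flags for each test: up to two relaxed atomic loads. */
static inline bool demo_needs_slow_path_reload(struct demo_work *w)
{
	if (atomic_load_explicit(&w->flags, memory_order_relaxed) & DEMO_WORK_CANCEL)
		return true;
	return atomic_load_explicit(&w->flags, memory_order_relaxed) & DEMO_WORK_HASHED;
}

/* Loads the flags once into a local and tests the cached copy. */
static inline bool demo_needs_slow_path_cached(struct demo_work *w)
{
	unsigned int flags = atomic_load_explicit(&w->flags, memory_order_relaxed);

	return flags & (DEMO_WORK_CANCEL | DEMO_WORK_HASHED);
}
```

Both helpers return the same answer for a given flags value; the cached
variant only differs if another thread changes the flags between the
two loads of the first variant.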

> In several io_uring benchmarks, this reduced the CPU time spent in
> queued_spin_lock_slowpath() considerably:
>
> io_uring benchmark with a flood of `IORING_OP_NOP` and `IOSQE_ASYNC`:
>
>     38.86%     -1.49%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
>      6.75%     +0.36%  [kernel.kallsyms]  [k] io_worker_handle_work
>      2.60%     +0.19%  [kernel.kallsyms]  [k] io_nop
>      3.92%     +0.18%  [kernel.kallsyms]  [k] io_req_task_complete
>      6.34%     -0.18%  [kernel.kallsyms]  [k] io_wq_submit_work
>
> HTTP server, static file:
>
>     42.79%     -2.77%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
>      2.08%     +0.23%  [kernel.kallsyms]  [k] io_wq_submit_work
>      1.19%     +0.20%  [kernel.kallsyms]  [k] amd_iommu_iotlb_sync_map
>      1.46%     +0.15%  [kernel.kallsyms]  [k] ep_poll_callback
>      1.80%     +0.15%  [kernel.kallsyms]  [k] io_worker_handle_work
>
> HTTP server, PHP:
>
>     35.03%     -1.80%  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
>      0.84%     +0.21%  [kernel.kallsyms]  [k] amd_iommu_iotlb_sync_map
>      1.39%     +0.12%  [kernel.kallsyms]  [k] _copy_to_iter
>      0.21%     +0.10%  [kernel.kallsyms]  [k] update_sd_lb_stats
>
> Signed-off-by: Max Kellermann <max.kellermann@xxxxxxxxx>

--
Pavel Begunkov
