Re: [PATCH v4 5/5] blktrace: Make init_blk_tracer() asynchronous when trace_async_init set

From: Yaxiong Tian

Date: Thu Jan 29 2026 - 22:09:52 EST

On 2026/1/30 09:35, Yaxiong Tian wrote:

On 2026/1/30 04:29, Steven Rostedt wrote:

On Wed, 28 Jan 2026 19:25:46 -0700
Jens Axboe <axboe@xxxxxxxxx> wrote:

On Jan 28, 2026, at 5:40 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

Jens,

Can you give me an acked-by on this patch and I can take the series through
my tree.

On phone, hope this works:

Acked-by: Jens Axboe <axboe@xxxxxxxxx>

Thanks!

Or perhaps this doesn't even need to test the trace_async_init flag and can
always do the work queue? Does blk_trace ever do tracing at boot up? That
is, before user space starts?

Not via the traditional way of running blktrace.

Masami and Yaxiong,

I've been thinking about this more and I'm not sure we need the
trace_async_init kernel parameter at all. As blktrace should only be
enabled by user space, it can always use the work queue.

For kprobes, if someone is adding a kprobe on the kernel command line, then
they are already specifying that tracing is more important.

Patch 3 already keeps kprobes from being an issue with contention of the
tracing locks, so I don't think it ever needs to use the work queue.

Wouldn't it just be better to remove the trace_async_init and make blktrace
always use the work queue and kprobes never do it (but exit out early if
there were no kprobes registered)?

That is, remove patches 2 and 4 and make this patch always use the work queue.

Yesterday I was curious about trace_event_update_all(), so I added pr_err(xx) prints inside the function's loop. The prints were still appearing as late as the 14-second mark (printing is time-consuming), by which time the desktop had already been up for quite a while. However, trace_eval_sync() had already finished running at 0.6 seconds.

I had originally assumed that the destroy_workqueue() call in trace_eval_sync() would wait for all queued work to complete, but this observation suggests that might not be the case. If so, then strictly speaking, work submitted with queue_work(xx) is not guaranteed to finish before the init process runs. If initialization must be strictly complete before user space starts, using async_synchronize_full() or async_synchronize_full_domain() would be better in such scenarios.
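
To make the two options above concrete, here is a minimal kernel-style sketch. It is not taken from the patch series: the demo_* names and the choice of initcall levels are hypothetical, and the fragment only builds inside a kernel tree. As far as the workqueue documentation goes, destroy_workqueue() drains work pending on that particular workqueue before tearing it down, so the sketch assumes the barrier only covers work queued on demo_wq itself, not work placed on some other queue.

```c
/* Sketch only: hypothetical demo_* names, not code from this series. */
#include <linux/async.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/printk.h>
#include <linux/workqueue.h>

/* Stand-in for a slow initialization step. */
static void demo_slow_init(void)
{
	pr_info("demo: slow init running\n");
}

/* Option A: dedicated workqueue, drained at a later initcall. */
static struct workqueue_struct *demo_wq;

static void demo_work_fn(struct work_struct *work)
{
	demo_slow_init();
}
static DECLARE_WORK(demo_work, demo_work_fn);

static int __init demo_queue_init(void)
{
	demo_wq = alloc_workqueue("demo_init", WQ_UNBOUND, 0);
	if (!demo_wq)
		return -ENOMEM;
	queue_work(demo_wq, &demo_work);	/* returns immediately */
	return 0;
}
fs_initcall(demo_queue_init);

static int __init demo_queue_sync(void)
{
	/* Expected to wait for work pending on demo_wq only. */
	if (demo_wq) {
		destroy_workqueue(demo_wq);
		demo_wq = NULL;
	}
	return 0;
}
late_initcall_sync(demo_queue_sync);

/* Option B: async API; a caller needing a barrier uses
 * async_synchronize_full() to wait for all scheduled async work. */
static void demo_async_fn(void *data, async_cookie_t cookie)
{
	demo_slow_init();
}

static int __init demo_async_init(void)
{
	async_schedule(demo_async_fn, NULL);	/* returns immediately */
	return 0;
}
fs_initcall(demo_async_init);
```

If the late prints come from work that was queued on a different workqueue than the one being destroyed, Option A would not be expected to wait for it, which is one thing worth ruling out when re-checking.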

I need to double-check this issue. In theory it shouldn't exist, but I'm not sure why the print appeared at the 14-second mark.

Of course, the situation described above is an extreme case. I don't oppose this approach; I only hope to make the startup faster for ordinary users who don’t use trace, while minimizing the impact on others as much as possible.

-- Steve