Re: [RFC][PATCH] tracing: Introduce restart_critical_timings()

From: Mathieu Desnoyers
Date: Wed Mar 20 2024 - 14:42:53 EST

Next message: Andrew Lunn: "Re: [PATCH v4 1/5] dt-bindings: misc: Add mikrobus-connector"
Previous message: Guenter Roeck: "Re: [PATCH v2 2/5] ext4: Add unit test of ext4_mb_generate_buddy"
In reply to: Steven Rostedt: "Re: [RFC][PATCH] tracing: Introduce restart_critical_timings()"
Next in thread: Waiman Long: "Re: [RFC][PATCH] tracing: Introduce restart_critical_timings()"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2024-03-20 13:58, Steven Rostedt wrote:

> On Wed, 20 Mar 2024 13:15:39 -0400
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
>>> I would like to introduce restart_critical_timings() and place it in
>>> locations that have this behavior.
>>
>> Is there any way you could move this to need_resched() rather than
>> sprinkle those everywhere?
>
> Because need_resched() itself does not mean it's going to schedule
> immediately. I looked at a few locations where need_resched() is called.
> Most are in idle code where the critical timings are already handled.
>
> I'm not sure I'd add it for places like mm/memory.c or drivers/md/bcache/btree.c.
>
> A lot of places look to use it more for PREEMPT_NONE situations as an
> open-coded cond_resched().
>
> The main reason this one is particularly an issue is that it spins as long
> as the owner is still running, which may be some time: here it was 7ms.

What I think we should be discussing here is how calling need_resched()
should interact with the latency tracked by critical timings.

AFAIU, when code explicitly calls need_resched() in a loop, there are
two cases:

- need_resched() returns false: This means the loop can continue without
causing long latency on the system. Technically we could restart the
critical timings at this point.

- need_resched() returns true: This means the loop should exit quickly
and call the scheduler. I would not reset the critical timings there,
as whatever code is executed between need_resched() returning true
and calling the scheduler is adding to latency.
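
To make the two cases concrete, here is a minimal userspace sketch, not
kernel code. Everything in it is hypothetical and exists only for
illustration: struct latency_model, model_need_resched() and model_spin()
are made-up names, the clock is a plain counter in arbitrary units, and I
am assuming that "restarting" critical timings simply moves the start of
the measured section to the current time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one measured critical section. */
struct latency_model {
	uint64_t now;            /* fake clock, arbitrary time units */
	uint64_t section_start;  /* start of the currently measured section */
	bool resched_pending;    /* stands in for the need-resched flag */
};

/*
 * Model of the proposed behavior: when no reschedule is pending, the
 * measured section is restarted. When one is pending, the timestamp is
 * left alone so the time until the scheduler runs still counts.
 */
static bool model_need_resched(struct latency_model *m, bool restart_on_false)
{
	if (m->resched_pending)
		return true;
	if (restart_on_false)
		m->section_start = m->now;
	return false;
}

/*
 * Spin `spins` times at `spin_cost` units each, then see a pending
 * reschedule and spend `exit_cost` units reaching the scheduler.
 * Returns the latency the model would report for the section.
 */
static uint64_t model_spin(uint64_t spins, uint64_t spin_cost,
			   uint64_t exit_cost, bool restart_on_false)
{
	struct latency_model m = { 0, 0, false };
	uint64_t done = 0;

	for (;;) {
		m.resched_pending = (done >= spins);
		if (model_need_resched(&m, restart_on_false))
			break;
		m.now += spin_cost;
		done++;
	}
	m.now += exit_cost;
	return m.now - m.section_start;
}
```

With 7 spins of 1000 units each and an exit cost of 5 units,
model_spin(7, 1000, 5, false) reports 7005 units, whereas
model_spin(7, 1000, 5, true) reports 1005: only the last spin iteration
plus the time between need_resched() returning true and reaching the
scheduler.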

Having stop/start critical timings around idle loops seems to just be
an optimization over that.

As for mm and drivers/md code, what is wrong with doing a critical
timings reset when need_resched() returns false? It would prevent
a whole class of false positives rather than playing whack-a-mole with
those that pop up.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
