[tip: timers/core] powerpc/rtas: Use fsleep() to minimize additional sleep duration

From: tip-bot2 for Anna-Maria Behnsen
Date: Tue Oct 15 2024 - 18:41:44 EST


The following commit has been merged into the timers/core branch of tip:

Commit-ID: b7f0eb8c9bc8662ca78082e82856fcb0cf16d7c6
Gitweb: https://git.kernel.org/tip/b7f0eb8c9bc8662ca78082e82856fcb0cf16d7c6
Author: Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx>
AuthorDate: Mon, 14 Oct 2024 10:22:30 +02:00
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitterDate: Wed, 16 Oct 2024 00:36:48 +02:00

powerpc/rtas: Use fsleep() to minimize additional sleep duration

When commit 38f7b7067dae ("powerpc/rtas: rtas_busy_delay() improvements")
was introduced, documentation about proper usage of sleep related functions
was outdated.

The commit message references the usage of a HZ=100 system. When using a
20ms sleep duration on such a system and therefore using msleep(), the
possible additional slack will be +10ms.

When the system is configured with HZ=100 the granularity of a jiffy and of
a bucket of the lowest timer wheel level is 10ms. To make sure a timer will
not expire early (when queueing of the timer races with an concurrent
update of jiffies), timers are always queued into the next bucket. This is
the reason for the maximal possible slack of 10ms.

fsleep() limits the maximal possible slack to 25% by making threshold
between usleep_range() and msleep() HZ dependent. As soon as the accuracy
of msleep() is sufficient, the less expensive timer list timer based
sleeping function is used instead of the more expensive hrtimer based
usleep_range() function. The udelay() will not be used in this specific
usecase as the lowest sleep length is larger than 1 millisecond.

Use fsleep() directly instead of using an own heuristic for the best
sleeping mechanism to use.

Signed-off-by: Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Reviewed-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
Acked-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx> (powerpc)
Link: https://lore.kernel.org/all/20241014-devel-anna-maria-b4-timers-flseep-v3-13-dc8b907cb62f@xxxxxxxxxxxxx

---
arch/powerpc/kernel/rtas.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index f7e86e0..d31c979 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1390,21 +1390,14 @@ bool __ref rtas_busy_delay(int status)
*/
ms = clamp(ms, 1U, 1000U);
/*
- * The delay hint is an order-of-magnitude suggestion, not
- * a minimum. It is fine, possibly even advantageous, for
- * us to pause for less time than hinted. For small values,
- * use usleep_range() to ensure we don't sleep much longer
- * than actually needed.
- *
- * See Documentation/timers/timers-howto.rst for
- * explanation of the threshold used here. In effect we use
- * usleep_range() for 9900 and 9901, msleep() for
- * 9902-9905.
+ * The delay hint is an order-of-magnitude suggestion, not a
+ * minimum. It is fine, possibly even advantageous, for us to
+ * pause for less time than hinted. To make sure pause time will
+ * not be way longer than requested independent of HZ
+ * configuration, use fsleep(). See fsleep() for details of
+ * used sleeping functions.
*/
- if (ms <= 20)
- usleep_range(ms * 100, ms * 1000);
- else
- msleep(ms);
+ fsleep(ms * 1000);
break;
case RTAS_BUSY:
ret = true;