Re: [PATCH 04/10] rv: Add rtapp_block monitor

From: Gabriele Monaco
Date: Wed Mar 12 2025 - 03:34:18 EST


On Tue, 2025-03-11 at 18:05 +0100, Nam Cao wrote:
> Add an RV monitor to detect realtime tasks getting blocked. For the full
> description, see Documentation/trace/rv/monitor_rtapp_block.rst.
>
> Signed-off-by: Nam Cao <namcao@xxxxxxxxxxxxx>
> ---
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> Cc: Waiman Long <longman@xxxxxxxxxx>
> ---
>  .../trace/rv/monitor_rtapp_block.rst          |  34 +++
>  include/trace/events/lock.h                   |  12 +
>  kernel/locking/rtmutex.c                      |   4 +
>  kernel/trace/rv/Kconfig                       |  12 +-
>  kernel/trace/rv/Makefile                      |   2 +
>  kernel/trace/rv/monitors/rtapp_block/ba.c     | 146 +++++++++++
>  kernel/trace/rv/monitors/rtapp_block/ba.h     | 166 +++++++++++++
>  kernel/trace/rv/monitors/rtapp_block/ltl      |   9 +
>  .../rv/monitors/rtapp_block/rtapp_block.c     | 232 ++++++++++++++++++

I see that creating this type of monitor requires a few more steps,
but are you considering automatic generation of Kconfig and
rtapp_block.c?
We could reuse (export) some code from dot2k for that, since the
skeleton could be the same.

Not needed for this series but it would be very nice to experiment
further.

I see the tracepoint generation is a bit complicated to generalise.
DA monitors currently don't give much data beyond pointing to what
failed in the model. If the user wants more, they can enable the
triggering tracepoints (e.g. if I want to see the exact details of a
context switch, I enable that event in the trace).

I get that it isn't the quickest thing to do, but it makes the monitors
very general. Do you think something like this could be done here too?
Perhaps by storing the possible error messages in the monitor's header
file (like the state names in a DA monitor).
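
A rough sketch of that last idea, only to make it concrete. Every
identifier below (the enum, the table and the helper) is hypothetical
and does not exist in this series; the layout just mirrors how a DA
monitor header keeps its state names in a string table indexed by an
enum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one entry per way the LTL property can be violated. */
enum rtapp_block_error {
	ERR_WAKEUP_BY_NON_RT,
	ERR_UNINTENTIONAL_SLEEP,
	ERR_MAX,
};

/* Hypothetical table, analogous to the state names of a DA monitor. */
static const char * const rtapp_block_error_names[] = {
	[ERR_WAKEUP_BY_NON_RT]    = "rt task woken by non-rt task",
	[ERR_UNINTENTIONAL_SLEEP] = "rt task sleeping unintentionally",
};

/* Catch an enum value added without a matching message at the end. */
static_assert(sizeof(rtapp_block_error_names) /
	      sizeof(rtapp_block_error_names[0]) == ERR_MAX,
	      "every error needs a message");

/* Map an error code to its message; out-of-range codes get a fallback. */
static inline const char *rtapp_block_error_name(unsigned int err)
{
	return err < ERR_MAX ? rtapp_block_error_names[err] : "unknown";
}
```

With something like this, the monitor's tracepoint would only need to
carry the error index, and the string lookup could stay generic across
monitors.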

>  kernel/trace/rv/rv_trace.h                    |  44 ++++
>  lib/Kconfig.debug                             |   3 +
>  11 files changed, 663 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/trace/rv/monitor_rtapp_block.rst
>  create mode 100644 kernel/trace/rv/monitors/rtapp_block/ba.c
>  create mode 100644 kernel/trace/rv/monitors/rtapp_block/ba.h
>  create mode 100644 kernel/trace/rv/monitors/rtapp_block/ltl
>  create mode 100644 kernel/trace/rv/monitors/rtapp_block/rtapp_block.c
>
> diff --git a/Documentation/trace/rv/monitor_rtapp_block.rst b/Documentation/trace/rv/monitor_rtapp_block.rst
> new file mode 100644
> index 000000000000..9cabbe66fa4a
> --- /dev/null
> +++ b/Documentation/trace/rv/monitor_rtapp_block.rst
> @@ -0,0 +1,34 @@
> +Monitor rtapp_block
> +=======================
> +
> +- Name: rtapp_block - real time applications are undesirably blocked
> +- Type: per-task linear temporal logic monitor
> +- Author: Nam Cao <namcao@xxxxxxxxxxxxx>
> +
> +Introduction
> +------------
> +
> +Real time threads could be blocked and fail to finish their execution timely. For instance, they
> +need to access shared resources which are already acquired by other threads. Or they could be
> +waiting for non-realtime threads to signal them to proceed: as the non-realtime threads are not
> +prioritized by the scheduler, the execution of realtime threads could be delayed indefinitely.
> +These scenarios are often unintentional, and cause unexpected latency to the realtime application.
> +
> +The rtapp_block monitor reports this type of scenario, by monitoring for:
> +
> +  * Realtime threads going to sleep without explicitly asking for it (namely, with nanosleep
> +    syscall).
> +  * Realtime threads are woken up by non-realtime threads.
> +
> +How to fix the monitor's warnings?
> +----------------------------------
> +
> +There is no single answer, the solution needs to be evaluated depending on the specific cases.
> +
> +If the realtime thread is blocked trying to take a `pthread_mutex_t` which is already taken by a
> +non-realtime thread, the solution could be enabling priority inheritance for the mutex, so that the
> +blocking non-realtime thread would be priority-boosted to run at realtime priority.
> +
> +If realtime thread needs to wait for non-realtime thread to signal it to proceed, perhaps the design
> +needs to be reconsidered to remove this dependency. Often, the work executed by the realtime thread
> +needs not to be realtime at all.
> diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
> index 8e89baa3775f..d4b32194d47f 100644
> --- a/include/trace/events/lock.h
> +++ b/include/trace/events/lock.h
> @@ -138,6 +138,18 @@ TRACE_EVENT(contention_end,
>   TP_printk("%p (ret=%d)", __entry->lock_addr, __entry->ret)
>  );
>  
> +#ifdef CONFIG_TRACE_RT_MUTEX_WAKE_WAITER
> +DECLARE_TRACE(rt_mutex_wake_waiter_begin,
> + TP_PROTO(struct task_struct *task),
> + TP_ARGS(task))
> +DECLARE_TRACE(rt_mutex_wake_waiter_end,
> + TP_PROTO(struct task_struct *task),
> + TP_ARGS(task))
> +#else
> +#define trace_rt_mutex_wake_waiter_begin(...)
> +#define trace_rt_mutex_wake_waiter_end(...)
> +#endif /* CONFIG_TRACE_RT_MUTEX_WAKE_WAITER */
> +
>  #endif /* _TRACE_LOCK_H */
>  
>  /* This part must be outside protection */
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 4a8df1800cbb..fc9cf4a2cf75 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> [...]
> diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
> index 8226352a0062..d65bf9bda2f2 100644
> --- a/kernel/trace/rv/Kconfig
> +++ b/kernel/trace/rv/Kconfig
> [...]
> diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
> index 188b64668e1f..6570a3116127 100644
> --- a/kernel/trace/rv/Makefile
> +++ b/kernel/trace/rv/Makefile
> [...]
> diff --git a/kernel/trace/rv/monitors/rtapp_block/ba.c b/kernel/trace/rv/monitors/rtapp_block/ba.c
> new file mode 100644
> index 000000000000..5e99f79d5e74
> --- /dev/null
> +++ b/kernel/trace/rv/monitors/rtapp_block/ba.c
> @@ -0,0 +1,146 @@
> [...]
> diff --git a/kernel/trace/rv/monitors/rtapp_block/ba.h b/kernel/trace/rv/monitors/rtapp_block/ba.h
> new file mode 100644
> index 000000000000..c1ba88f6779a
> --- /dev/null
> +++ b/kernel/trace/rv/monitors/rtapp_block/ba.h
> @@ -0,0 +1,166 @@
> [...]
> +
> +/**
> + * rv_rtapp_block_atoms_fetch - fetch the atomic propositions
> + * @task: the task
> + * @mon: the LTL monitor
> + *
> + * Must be implemented. This function is called anytime the Buchi automaton is triggered. Its
> + * intended purpose is to update the atomic propositions which are expensive to trace and can be
> + * easily read from @task. rv_rtapp_block_atom_set() should be used to implement this function.
> + *
> + * Using this function may cause incorrect verification result if it is important for the LTL that
> + * the atomic propositions must be updated at the correct time. Therefore, if it is possible,
> + * updating atomic propositions should be done with rv_rtapp_block_atom_update() instead.
> + *
> + * An example where this function is useful is with the LTL property:
> + *    always (RT imply not PAGEFAULT)
> + * (a realtime task does not raise page faults)
> + *
> + * In this example, adding tracepoints to track RT is complicated, because it is changed in
> + * different places (mutex's priority boosting, sched_setscheduler). Furthermore, for this LTL
> + * property, we don't care exactly when RT changes, as long as we have its correct value when
> + * PAGEFAULT==true. Therefore, it is better to update RT in rv_rtapp_block_atoms_fetch(), as it
> + * can easily be retrieved from task_struct.
> + *
> + * This function can be empty.

Personal preference, but what about having the examples only in the
docs and pointing to those from here? Just to keep the code a bit slimmer.

> + */
> +void rv_rtapp_block_atoms_fetch(struct task_struct *task, struct ltl_monitor *mon);
> +
> +/**
> + * rv_rtapp_block_atom_update - update an atomic proposition
> + * @task: the task
> + * @atom: the atomic proposition, one of enum rtapp_block_atom
> + * @value: the new value for @atom
> + *
> + * Update an atomic proposition and trigger the Buchi automaton to check for violation of the LTL
> + * property. This function can be called in tracepoints' handler, for example.
> + */
> +void rv_rtapp_block_atom_update(struct task_struct *task, unsigned int atom, bool value);
> +
> +/**
> + * rv_rtapp_block_atom_get - get an atomic proposition
> + * @mon: the monitor
> + * @atom: the atomic proposition, one of enum rtapp_block_atom
> + *
> + * Returns the value of an atomic proposition.
> + */
> +static inline
> +enum ltl_truth_value rv_rtapp_block_atom_get(struct ltl_monitor *mon, unsigned int atom)
> +{
> + return mon->atoms[atom];
> +}
> +
> +/**
> + * rv_rtapp_block_atom_set - set an atomic proposition
> + * @mon: the monitor
> + * @atom: the atomic proposition, one of enum rtapp_block_atom
> + * @value: the new value for @atom
> + *
> + * Update an atomic proposition without triggering the Buchi automaton. This can be useful to
> + * implement rv_rtapp_block_atoms_fetch() and rv_rtapp_block_atoms_init().
> + *
> + * Another use case for this function is when multiple atomic propositions change at the same time,
> + * because calling rv_rtapp_block_atom_update() (and thus triggering the Buchi automaton)
> + * multiple times may be incorrect. In that case, rv_rtapp_block_atom_set() can be used to avoid
> + * triggering the Buchi automaton, and rv_rtapp_block_atom_update() is only used for the last
> + * atomic proposition.
> + */
> +static inline
> +void rv_rtapp_block_atom_set(struct ltl_monitor *mon, unsigned int atom, bool value)
> +{
> + mon->atoms[atom] = value;
> +}
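
The set-then-update pattern described in the comment above can be
illustrated with a small user-space model. This is a sketch under
simplifying assumptions: the demo_* types and functions are stand-ins
invented for illustration (the real update function takes a
task_struct and steps the generated automaton), and the "automaton" is
reduced to a counter of how often it was triggered.

```c
#include <stdbool.h>

/* Simplified stand-ins for illustration; not the kernel definitions. */
enum demo_atom { DEMO_RT, DEMO_SLEEP, DEMO_ATOM_MAX };

struct demo_monitor {
	bool atoms[DEMO_ATOM_MAX];
	unsigned int steps;	/* how often the automaton was triggered */
};

/* Like rv_rtapp_block_atom_set(): change the value, do not step. */
static void demo_atom_set(struct demo_monitor *mon, unsigned int atom, bool value)
{
	mon->atoms[atom] = value;
}

/* Like rv_rtapp_block_atom_update(): change the value, then step. */
static void demo_atom_update(struct demo_monitor *mon, unsigned int atom, bool value)
{
	mon->atoms[atom] = value;
	mon->steps++;		/* stand-in for stepping the Buchi automaton */
}

/*
 * Two atoms change in the same event: set the first one silently and
 * trigger the automaton only once, on the last one, so it never sees
 * the intermediate (inconsistent) combination.
 */
static void demo_event(struct demo_monitor *mon, bool rt, bool sleep)
{
	demo_atom_set(mon, DEMO_RT, rt);
	demo_atom_update(mon, DEMO_SLEEP, sleep);
}
```

Calling demo_event() once leaves both atoms updated with a single
automaton step, which is the behaviour the comment asks for.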
> +
> +/**
> + * rv_rtapp_block_get_data - get the custom data of this monitor.
> + * @mon: the monitor
> + *
> + * If this function is used, rv_rtapp_block_init() must have been called with a positive
> + * data_size.
> + */
> +static inline void *rv_rtapp_block_get_data(struct ltl_monitor *mon)
> +{
> + return &mon->data;
> +}
> diff --git a/kernel/trace/rv/monitors/rtapp_block/ltl b/kernel/trace/rv/monitors/rtapp_block/ltl
> new file mode 100644
> index 000000000000..781f0144a222
> --- /dev/null
> +++ b/kernel/trace/rv/monitors/rtapp_block/ltl
> @@ -0,0 +1,9 @@
> +RULE = always (WAKEUP_RT_TASK imply (RT or WAKEUP_WHITELIST))
> +   and always ((USER_TASK and RT) imply (SLEEP imply INTENTIONAL_SLEEP))
> +
> +INTENTIONAL_SLEEP = DO_NANOSLEEP or FUTEX_LOCK_WITH_PI
> +
> +WAKEUP_WHITELIST = RT_MUTEX_WAKING_WAITER
> +                or STOPPING_WOKEN_TASK
> +                or WOKEN_TASK_IS_MIGRATION
> +                or WOKEN_TASK_IS_RCU
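
Since both clauses of RULE have the form "always (<formula over the
atoms>)", a violation can only show up as a single step at which that
formula is false. The sketch below spells the formula out as a
per-step check. It is a simplification made for illustration: the
struct layout and function names are invented, and plain bool is used
where the patch uses enum ltl_truth_value and a generated Buchi
automaton.

```c
#include <stdbool.h>

/* Snapshot of the atomic propositions at one step (illustrative layout). */
struct rtapp_block_atoms {
	bool wakeup_rt_task, rt, user_task, sleep;
	bool do_nanosleep, futex_lock_with_pi;
	bool rt_mutex_waking_waiter, stopping_woken_task;
	bool woken_task_is_migration, woken_task_is_rcu;
};

/* Logical implication: "a imply b". */
static bool implies(bool a, bool b)
{
	return !a || b;
}

/* True if this step satisfies the body of both "always" clauses. */
static bool rtapp_block_step_ok(const struct rtapp_block_atoms *a)
{
	bool intentional_sleep = a->do_nanosleep || a->futex_lock_with_pi;
	bool wakeup_whitelist = a->rt_mutex_waking_waiter ||
				a->stopping_woken_task ||
				a->woken_task_is_migration ||
				a->woken_task_is_rcu;

	return implies(a->wakeup_rt_task, a->rt || wakeup_whitelist) &&
	       implies(a->user_task && a->rt,
		       implies(a->sleep, intentional_sleep));
}
```

For example, a step with only WAKEUP_RT_TASK set fails the check, and
additionally setting RT_MUTEX_WAKING_WAITER makes it pass again.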
> diff --git a/kernel/trace/rv/monitors/rtapp_block/rtapp_block.c b/kernel/trace/rv/monitors/rtapp_block/rtapp_block.c
> new file mode 100644
> index 000000000000..3f5b1efb7af0

I'm wondering if it would be cleaner to keep the specifications under tools/
(just like the dot files for DA monitors).
Either way, we should be consistent with whatever we choose.

Thanks,
Gabriele

> --- /dev/null
> +++ b/kernel/trace/rv/monitors/rtapp_block/rtapp_block.c
> [...]
> diff --git a/kernel/trace/rv/rv_trace.h b/kernel/trace/rv/rv_trace.h
> index 96264233cac5..79a7388b5c55 100644
> --- a/kernel/trace/rv/rv_trace.h
> +++ b/kernel/trace/rv/rv_trace.h
> [...]
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 1af972a92d06..942318ef3f62 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1638,6 +1638,9 @@ config TRACE_IRQFLAGS
>     Enables hooks to interrupt enabling and disabling for
>     either tracing or lock debugging.
>  
> +config TRACE_RT_MUTEX_WAKE_WAITER
> + bool
> +
>  config TRACE_IRQFLAGS_NMI
>   def_bool y
>   depends on TRACE_IRQFLAGS