Re: On trace_*_rcuidle functions in modules
From: John Stultz
Date: Wed Apr 15 2020 - 22:18:13 EST
On Wed, Apr 15, 2020 at 5:48 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Wed, 15 Apr 2020 17:06:24 -0700
> John Stultz <john.stultz@xxxxxxxxxx> wrote:
>
> > So you're saying the recent change to move to using trace_*_rcuidle()
> > was unnecessary?
> >
> > Or is there a different notifier then cpu_pm_register_notifier() that
> > the driver should be using (that one seems to be using
> > atomic_notifier_chain_register())?
>
> From looking at the trace event in __tcs_buffer_write() in
> drivers/soc/qcom/rpmh-rsc.c, the _rcuidle() was added by:
>
> efde2659b0fe8 ("drivers: qcom: rpmh-rsc: Use rcuidle tracepoints for rpmh")
>
> Which shows a backtrace dump of:
>
> Call trace:
> dump_backtrace+0x0/0x174
> show_stack+0x20/0x2c
> dump_stack+0xc8/0x124
> lockdep_rcu_suspicious+0xe4/0x104
> __tcs_buffer_write+0x230/0x2d0
> rpmh_rsc_write_ctrl_data+0x210/0x270
> rpmh_flush+0x84/0x24c
> rpmh_domain_power_off+0x78/0x98
> _genpd_power_off+0x40/0xc0
> genpd_power_off+0x168/0x208
> genpd_power_off+0x1e0/0x208
> genpd_power_off+0x1e0/0x208
> genpd_runtime_suspend+0x1ac/0x220
> __rpm_callback+0x70/0xfc
> rpm_callback+0x34/0x8c
> rpm_suspend+0x218/0x4a4
> __pm_runtime_suspend+0x88/0xac
> psci_enter_domain_idle_state+0x3c/0xb4
> cpuidle_enter_state+0xb8/0x284
> cpuidle_enter+0x38/0x4c
> call_cpuidle+0x3c/0x68
> do_idle+0x194/0x260
> cpu_startup_entry+0x24/0x28
> secondary_start_kernel+0x150/0x15c
>
>
> There's no notifier that calls this. This is called by the rpm_callback
> logic. Perhaps that callback will require a call to rcu_irq_enter()
> before calling the callback.
Yea. Sorry, its extra confusing as the call stack there includes
patches who's equivalents are only now in -next (I myself managed to
confuse what was upstream vs in -next in this thread and suddenly
couldn't find the code I had described - apologies).
See:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/soc/qcom/rpmh-rsc.c?h=next-20200415#n795
> In any case, I think it is wrong that these callbacks are called
> without RCU watching. The _rcuidle() on that tracepoint should be
> removed, and we fix the code that gets there to ensure that RCU is
> enabled. I agree with Peter, that no module code should be executed
> without RCU watching.
For sanity sake, it seems like the rule should be we avoid driver code
executing without RCU watching. The fact of if it's a loadable module
or not is super subtle, and likely that more folks will trip over it.
But ok. I'll follow around to understand if the commit efde2659b0fe8
("drivers: qcom: rpmh-rsc: Use rcuidle tracepoints for rpmh") is
actually necessary and see what would be needed to revert it.
thanks
-john