Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
From: Breno Leitao
Date: Tue Aug 13 2024 - 09:33:18 EST
Hello Andy,
On Fri, Aug 09, 2024 at 02:03:27PM +0300, Andy Shevchenko wrote:
> On Fri, Aug 9, 2024 at 2:57 AM Andi Shyti <andi.shyti@xxxxxxxxxx> wrote:
> > On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
> > > The problem arises because during __pm_runtime_resume(), the spinlock
> > > &dev->power.lock is acquired before rpm_resume() is called. Later,
> > > rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> > > mutexes, triggering the error.
> > >
> > > To address this issue, devices on ACPI are now marked as not IRQ-safe,
> > > considering the dependency of acpi_subsys_runtime_resume() on mutexes.
>
> This is a step in the right direction
Thanks
> but somewhere in the replies
> here I would like to hear about roadmap to get rid of the
> pm_runtime_irq_safe() in all Tegra related code.
Agreed, that seems like the right way to go, but this is a question for
the maintainers, Laxman and Dmitry.
By the way, looking at lore, I found that the last email from Laxman is
from 2022, and Dmitry seems to be using a different address. Let me copy
Dmitry's other address (dmitry.osipenko@xxxxxxxxxxxxx) here.
> > > + if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
> >
> > looks good to me, can I have an ack from Andy here?
>
> I prefer to see something like
> is_acpi_node() / is_acpi_device_node() / is_acpi_data_node() /
> has_acpi_companion()
> instead depending on the actual ACPI representation of the device.
>
> Otherwise no objections.
> Please, Cc me (andy@xxxxxxxxxx) for the next version.
Thanks for the feedback. I agree that using one of the functions above
would be better. What about something like:
Author: Breno Leitao <leitao@xxxxxxxxxx>
Date:   Thu Jun 6 06:27:07 2024 -0700

    Do not mark ACPI devices as irq safe

    On ACPI machines, the tegra i2c module encounters an issue due to a
    mutex being called inside a spinlock. This leads to the following bug:

        BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
        in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
        preempt_count: 0, expected: 0
        RCU nest depth: 0, expected: 0
        irq event stamp: 0

        Call trace:
        __might_sleep
        __mutex_lock_common
        mutex_lock_nested
        acpi_subsys_runtime_resume
        rpm_resume
        tegra_i2c_xfer

    The problem arises because during __pm_runtime_resume(), the spinlock
    &dev->power.lock is acquired before rpm_resume() is called. Later,
    rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
    mutexes, triggering the error.

    To address this issue, devices on ACPI are now marked as not IRQ-safe,
    considering the dependency of acpi_subsys_runtime_resume() on mutexes.

    Co-developed-by: Michael van der Westhuizen <rmikey@xxxxxxxx>
    Signed-off-by: Michael van der Westhuizen <rmikey@xxxxxxxx>
    Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>

diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 85b31edc558d..1df5b4204142 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -1802,9 +1802,9 @@ static int tegra_i2c_probe(struct platform_device *pdev)
 	 * domain.
 	 *
 	 * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
-	 * be used for atomic transfers.
+	 * be used for atomic transfers. ACPI device is not IRQ safe also.
 	 */
-	if (!IS_VI(i2c_dev))
+	if (!IS_VI(i2c_dev) && !has_acpi_companion(i2c_dev->dev))
 		pm_runtime_irq_safe(i2c_dev->dev);
 
 	pm_runtime_enable(i2c_dev->dev);
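
To make the intent of the new condition easier to review, here is a small
userspace model of it. This is only a sketch and not driver code: struct
model_i2c_dev and the model_*() helpers are hypothetical names made up for
illustration, and the comment about callbacks running with interrupts
disabled is my reading of the runtime PM core, stated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two facts the probe path checks. */
struct model_i2c_dev {
	bool is_vi;              /* stands in for IS_VI(i2c_dev) */
	bool has_acpi_companion; /* stands in for has_acpi_companion(i2c_dev->dev) */
};

/* Same boolean condition as the proposed hunk. */
static bool model_mark_irq_safe(const struct model_i2c_dev *d)
{
	return !d->is_vi && !d->has_acpi_companion;
}

/*
 * Assumption: an IRQ-safe device has its runtime PM callbacks invoked
 * with interrupts disabled, so a resume callback that takes a mutex is
 * only acceptable when the device is not IRQ-safe.
 */
static bool model_resume_may_take_mutex(const struct model_i2c_dev *d)
{
	return !model_mark_irq_safe(d);
}

/* Invariant the patch is after: an ACPI device is never IRQ-safe. */
static void model_check(const struct model_i2c_dev *d)
{
	assert(!(d->has_acpi_companion && model_mark_irq_safe(d)));
}
```

With this model, a device that is neither VI nor ACPI is still marked
IRQ-safe, while VI and ACPI devices are both left unmarked, which is the
only behaviour change the hunk is meant to introduce.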