Re: [RFT/PATCH] serial: omap: prevent resume if device is not suspended.

From: Kevin Hilman
Date: Tue Oct 02 2012 - 20:33:14 EST


"Poddar, Sourav" <sourav.poddar@xxxxxx> writes:

> Hi,
>
> On Tue, Sep 25, 2012 at 2:51 PM, Russell King - ARM Linux
> <linux@xxxxxxxxxxxxxxxx> wrote:
>> On Tue, Sep 25, 2012 at 12:11:14PM +0300, Felipe Balbi wrote:
>>> On Tue, Sep 25, 2012 at 10:12:28AM +0100, Russell King - ARM Linux wrote:
>>> > On Tue, Sep 25, 2012 at 11:31:20AM +0300, Felipe Balbi wrote:
>>> > > On Tue, Sep 25, 2012 at 09:30:29AM +0100, Russell King - ARM Linux wrote:
>>> > > > How is this happening? I think that needs proper investigation - or if
>>> > > > it's had more investigation, then the results needs to be included in
>>> > > > the commit description so that everyone can understand the issue here.
>>> > > >
>>> > > > We should not be resuming a device which hasn't been suspended. Maybe
>>> > > > the runtime PM enable sequence is wrong, and that's what should be fixed
>>> > > > instead?
>>> > > >
>>> > > > This sequence in the probe() function:
>>> > > >
>>> > > > pm_runtime_irq_safe(&pdev->dev);
>>> > > > pm_runtime_enable(&pdev->dev);
>>> > > > pm_runtime_get_sync(&pdev->dev);
>>> > > >
>>> > > > would enable runtime PM while the s/w state indicates that it's disabled,
>>> > > > and then that pm_runtime_get_sync() will want to resume the device. See
>>> > > > the section "5. Runtime PM Initialization, Device Probing and Removal"
>>> > > > in Documentation/power/runtime_pm.txt, specifically the second paragraph
>>> > > > of that section.
>>> > >
>>> > > that was tested. It worked in pandaboard but didn't work on beagleboard
>>> > > XM. Sourav tried to start a discussion about that, but it simply died...
>>> > >
>>> > > In any case, pm_runtime_get_sync() in probe will always call
>>> > > runtime_resume callback, right ?
>>> >
>>> > Well, if the runtime PM state says it's suspended, and then you enable
>>> > runtime PM, the first call to pm_runtime_get_sync() will trigger a resume
>>> > attempt. The patch description is complaining about resume events without
>>> > there being a preceding suspend event.
>>> >
>>> > This could well be why.
>>>
>>> that's most likely, of course. But should we cause a regression to
>>> beagleboard XM because of that ?
>>
>> What would cause a regression on beagleboard XM? I have not suggested
>> any change other than more investigation of the issue and a fuller patch
>> description - yet you're screaming (idiotically IMHO) that mere
>> investigation would break beagleboard.
>>
>> Well, if it's _that_ fragile, that mere investigation of this issue by
>> someone elsewhere on the planet would break your beagleboard, maybe it
>> deserves to be broken!
>
> The issue was observed at serial init itself in the N800 board and the
> log does not
> show up much.
> http://www.pwsan.com/omap/testlogs/test_tty_next_e36851d0/20120910020323/boot/2420n800/2420n800_log.txt
> What we thought the problem might be with n800 is that it tries to
> resume when it didn't suspend before.
>
> There are two ways through which we thought of handling this issue:
>
> a) set device as active before enabling pm (which will prevent
>
> pm_runtime_set_active(dev);
> pm_runtime_enable(dev);
>
> OR
>
> b) adding a "suspended" flag to struct omap_uart_port which gets set on
> suspend and cleared on resume. Then on resume you can check:
>
> if (!up->suspended)
> return 0;
>
> But using "pm_runtime_set_active" approach breaks things even on
> beagle board xm, though
> it works fine on Panda.
> Therefore, we used the "suspended" flag approach.
>
> So. I just wanted to get some feedback from community about how using
> "pm_runtime_set_active"
> behaves differently in omap3 and omap4.

As Russell has already pointed out in great detail, the difference is
simply a mismatch between assumed HW stated and actual hardware state
between various boards. Put simply, the driver assumes the HW is
disabled (runtime suspended) when it loads, and the first runtime resume
is meant to enable the HW. If that assumption is wrong, it needs to be
fixed.

Have you figured out why the HW is already active on OMAP2? (probably
bootloader?)

That being said, already active HW should not cause this problem. In
fact, because of possible early console use, the hwmod init of the UART
hwmods does not idle/reset them on boot, so they are left in the state
that the bootloader set them in.

When the hwmod is later enabled for real during probe, the hwmod muxing
is done for that IP. So, I suspect what is really happening is that the
mux settings are not right for the UARTS on n800, so when the probe
happens, the UART mux settings are changed and you lose the UART.

Can you double check the UART mux settings for that board? You might
need some different mux settings in the board file.

Kevin

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/