Re: PNP0501 serial driver takes almost 2 seconds to suspend/resume (printk issue)

From: Todd Brandt
Date: Wed Jul 13 2022 - 14:24:03 EST


URGENT: Removing the commit FIXES the issue.

I just ran a 5.19.0-rc6 kernel with the offending commit removed and it
fixed the problem completely on all 3 machines. To be clear the
offending commit is:

commit 3b604ca81202eea2a917eb6491e90f610fba0ec7

I strongly recommend that this commit be pulled (or fixed very quickly)
before the 5.19 release or 1 in 10 linux machines running 5.19 will
take up to 2 seconds longer in suspend/resume.

commit 3b604ca81202eea2a917eb6491e90f610fba0ec7
Author: John Ogness <john.ogness@xxxxxxxxxxxxx>
Date: Thu Apr 21 23:28:46 2022 +0206

printk: add pr_flush()

Provide a might-sleep function to allow waiting for console
printers
to catch up to the latest logged message.

Use pr_flush() whenever it is desirable to get buffered messages
printed before continuing: suspend_console(), resume_console(),
console_stop(), console_start(), console_unblank().

Signed-off-by: John Ogness <john.ogness@xxxxxxxxxxxxx>
Reviewed-by: Petr Mladek <pmladek@xxxxxxxx>
Signed-off-by: Petr Mladek <pmladek@xxxxxxxx>
Link:
https://lore.kernel.org/r/20220421212250.565456-12-john.ogness@xxxxxxxxxxxxx

include/linux/printk.h | 7 +++++
kernel/printk/printk.c | 83
++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 90 insertions(+)



On Wed, 2022-07-13 at 10:11 -0700, Todd Brandt wrote:
> I've updated the bugzilla entry with new data:
> https://bugzilla.kernel.org/show_bug.cgi?id=216216
>
> I just added 3 new tests for 5.19.0-rc6 on 3 machines that see this
> issue: otcpl-asus-e200-cht (cherry trail), otcpl-aml-y (amber lake),
> and otcpl-whl-u (whiskey lake). The kernel has the
> CONFIG_PRINTK_CALLER
> option enabled.
>
> The test is a S2idle and is run thusly:
> %> sleepgraph -dev -m freeze -rtcwake 15
>
> I've included the dmesg boot logs for all three. The dmesg
> suspend/resume logs are included in the html timelines by clicking
> the
> "dmesg" button in the upper right hand corner of the timeline.
> There's
> a "log" button as well that shows other system into.
>
> These files are attached to the bugzilla entry.
> otcpl-aml-y-5.19.0-rc6-boot-dmesg.txt
> otcpl-aml-y-5.19.0-rc6-freeze.html
> otcpl-asus-e200-cht-5.19.0-rc6-boot-dmesg.txt
> otcpl-asus-e200-cht-5.19.0-rc6-freeze.html
> otcpl-whl-u-5.19.0-rc6-boot-dmesg.txt
> otcpl-whl-u-5.19.0-rc6-freeze.html
>
> If possible can we move this thread to the bugzilla comment section?
>
>
> On Wed, 2022-07-13 at 11:57 +0206, John Ogness wrote:
> > On 2022-07-11, Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx> wrote:
> > > > It seems that __pr_flush() does not check whether all consoles
> > > > are
> > > > suspended. In this case the progress is not possible and it has
> > > > to
> > > > wait the entire timeout.
> > >
> > > But isn't console_suspended set after pr_flush() call?
> >
> > There should not be any printing after the suspend_console()
> > message.
> > If
> > Todd's report is coming from 5.19-rc1, then it is likely a kthread
> > issue, where the kthread is not respecting @console_suspended.
> > (This
> > would still need to be fixed for the kthreads, but would not be
> > relevant
> > for 5.19.)
> >
> > John