Re: [PATCH] USB:ohci:fix ohci interruption problem

From: liulongfang
Date: Tue Apr 06 2021 - 09:12:52 EST


2021/4/2 23:26, Alan Stern 写道:
> On Fri, Apr 02, 2021 at 05:27:59PM +0800, Longfang Liu wrote:
>> The operating method of the system entering S4 sleep mode:
>> echo disk > /sys/power/state
>
> This discussion is still not right.
>
The operating method is:
echo reboot > /sys/power/disk
echo disk > /sys/power/state

>> When OHCI enters the S4 sleep state,
>
> To start with, you should be talking about hibernation (also known as
> suspend-to-disk), not S4. When the system enters hibernation -- for
> example, when you write "disk" to /sys/power/state -- the controller may
> go into S4 or it may go into some other power-saving state.
>
>> the USB sleep process will call
>> check_root_hub_suspend() and ohci_bus_suspend() instead of
>> ohci_suspend() and ohci_bus_suspend(), this causes the OHCI interrupt
>> to not be closed.
>
> This isn't true. The procedure _does_ call ohci_suspend, through the
> .poweroff callback in hcd-pci.c. That callback goes to the
> hcd_pci_suspend routine, which calls suspend_common and then
> ohci_suspend.
>
> However, these calls happen after the kernel image has be written to the
> storage area on the disk. As a result, any log messages produced during
> the calls do not get saved, so they don't get reloaded when the system
> resumes from hibernation, and they aren't present in the log after the
> system wakes up. That's why they didn't appear in the log you included
> in an earlier email. The only way to observe them is to use a remote > console, such as a network console.
>After adding dump_stack to ohci_suspend, do hibernation test,
.poweroff is not called, but .freeze is called, and these logs are
presented in dmesg:
[root@localhost power]# echo reboot > disk
[root@localhost power]# echo disk > state
[ 1883.631163] PM: hibernation: hibernation entry
[ 1883.701199] Filesystems sync: 0.058 seconds
[ 1883.705443] Freezing user space processes ... (elapsed 0.004 seconds) done.
[ 1883.717094] OOM killer disabled.
[ 1883.730258] PM: hibernation: Preallocating image memory
[ 1889.162453] PM: hibernation: Allocated 1020044 pages for snapshot
[ 1889.168564] PM: hibernation: Allocated 4080176 kbytes in 5.42 seconds (752.80 MB/s)
[ 1889.176215] Freezing remaining freezable tasks ... (elapsed 0.099 seconds) done.
[ 1889.285477] printk: Suspending console(s) (use no_console_suspend to debug)
...
[ 1889.325720] Call trace:
[ 1889.325734] dump_backtrace+0x0/0x1e0
[ 1889.325742] show_stack+0x2c/0x48
[ 1889.325766] dump_stack+0xcc/0x104
[ 1889.325789] ohci_suspend+0x38/0xd8 [ohci_hcd]
[ 1889.325823] suspend_common+0xe0/0x160
[ 1889.325835] hcd_pci_freeze+0x38/0x48
[ 1889.325853] pci_pm_freeze+0x68/0x110
[ 1889.325881] dpm_run_callback+0x4c/0x230
[ 1889.325891] __device_suspend+0x108/0x4d8
[ 1889.325900] async_suspend+0x34/0xb8
[ 1889.325907] async_run_entry_fn+0x4c/0x118
[ 1889.325919] process_one_work+0x1f0/0x4a0
[ 1889.325926] worker_thread+0x48/0x460
[ 1889.325936] kthread+0x160/0x168
[ 1889.325947] ret_from_fork+0x10/0x18
...
[ 1895.297836] Call trace:
[ 1895.297846] dump_backtrace+0x0/0x1e0
[ 1895.297880] show_stack+0x2c/0x48
[ 1895.297925] dump_stack+0xcc/0x104
[ 1895.297973] usb usb3: root hub lost power or was reset
[ 1895.297997] ohci_resume+0x50/0x1a0 [ohci_hcd]
[ 1895.298057] resume_common+0xa0/0x120
[ 1895.298071] hcd_pci_restore+0x24/0x30
[ 1895.298084] pci_pm_restore+0x64/0xb0
[ 1895.298101] dpm_run_callback+0x4c/0x230
[ 1895.298113] device_resume+0xdc/0x1c8
[ 1895.298125] async_resume+0x30/0x60
[ 1895.298132] async_run_entry_fn+0x4c/0x118
[ 1895.298141] process_one_work+0x1f0/0x4a0
[ 1895.298148] worker_thread+0x48/0x460
[ 1895.298159] kthread+0x160/0x168
[ 1895.298171] ret_from_fork+0x10/0x18
...
[ 1900.939779] OOM killer enabled.
[ 1900.942930] Restarting tasks ... done.
[ 1900.962630] PM: hibernation: hibernation exit

> In fact, that's pretty much the only way to debug problems that occur
> during a hibernation transition.
>
>> At this time, if just one device interrupt is reported. Since rh_state
>> has been changed to OHCI_RH_SUSPENDED after ohci_bus_suspend(), the
>> driver will not process and close this device interrupt.
>
> That's not true either. The ohci_irq routine does indeed process
> interrupts even when rh_state is set to OHCI_RH_SUSPENDED. How else
> could it handle a device's wakeup request?
>
>> It will cause
>> the entire system to be stuck during sleep, causing the device to
>> fail to respond.
>
> During hibernation, the system is powered off. Obviously the kernel is
> not capable of handling interrupts at this time.
>
> Also, why would a device interrupt be reported at this time? What
> causes the interrupt request?
>
>> When the abnormal interruption reaches 100,000 times, the system will
>> forcibly close the interruption and make the device unusable.
>>
>> Because the root cause of the problem is that ohci_suspend is not
>> called to perform normal interrupt shutdown operations when the system
>> enters S4 sleep mode.
>>
>> Therefore, our solution is to specify freeze interface in this mode to
>> perform normal suspend_common() operations, and call ohci_suspend()
>> after check_root_hub_suspend() is executed through the suspend_common()
>> operation.
>
> No. The freeze interface does not need to power-down the controller.
> All it needs to do is make sure that no communication between the
> computer and the attached USB devices takes place, and this is handled
> by ohci_bus_suspend.
>
> Furthermore, it is a mistake for the freeze routine to change anything
> unless the thaw routine reverses the change. Your patch leaves the thaw
> callback pointer set to NULL.
>
>> After using this solution, it is verified by the stress test of sleep
>> wake up in S4 mode for a long time that this problem no longer occurs.
>
> Something else must be happeneing, something you don't understand.
>
> Alan Stern
>
>> Signed-off-by: Longfang Liu <liulongfang@xxxxxxxxxx>
>> ---
>> drivers/usb/core/hcd-pci.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/core/hcd-pci.c b/drivers/usb/core/hcd-pci.c
>> index 1547aa6..c5844a3 100644
>> --- a/drivers/usb/core/hcd-pci.c
>> +++ b/drivers/usb/core/hcd-pci.c
>> @@ -509,6 +509,11 @@ static int resume_common(struct device *dev, int event)
>>
>> #ifdef CONFIG_PM_SLEEP
>>
>> +static int hcd_pci_freeze(struct device *dev)
>> +{
>> + return suspend_common(dev, device_may_wakeup(dev));
>> +}
>> +
>> static int hcd_pci_suspend(struct device *dev)
>> {
>> return suspend_common(dev, device_may_wakeup(dev));
>> @@ -605,7 +610,7 @@ const struct dev_pm_ops usb_hcd_pci_pm_ops = {
>> .suspend_noirq = hcd_pci_suspend_noirq,
>> .resume_noirq = hcd_pci_resume_noirq,
>> .resume = hcd_pci_resume,
>> - .freeze = check_root_hub_suspended,
>> + .freeze = hcd_pci_freeze,
>> .freeze_noirq = check_root_hub_suspended,
>> .thaw_noirq = NULL,
>> .thaw = NULL,
>> --
>> 2.8.1
>>
> .
>
Thanks,
Longfang.