Re: [PATCH] wifi: mwifiex: decrease timeout waiting for host sleep from 10s to 5s
From: Pin-yen Lin
Date: Thu Dec 05 2024 - 08:49:23 EST
Hi Doug,
On Thu, Dec 5, 2024 at 2:11 AM Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> On Wed, Dec 4, 2024 at 5:45 AM Pin-yen Lin <treapking@xxxxxxxxxxxx> wrote:
> >
> > > > > Can you try testing (and gather timing numbers) when
> > > > > suspending soon after initiating scans? It's hard to judge what the
> > > > > lower limit of this timeout should really be without any numbers, just
> > > > > like it's hard to judge whether your 10 second watchdog is reasonable.
> > > >
> > > > Pin-yen: is this something you could gather?
> >
> > I tried entering suspend right after wifi scans, and the time spent in
> > mwifiex_enable_hs() is always around 100ms. It seems initiating
> > suspend does not increase the execution time for mwifiex_enable_hs(),
> > so I think the driver is capable of interrupting a scan.
> > > >
> > > >
> > > > > Also, for the record, since we might have to field regression reports
> > > > > for other systems: what hardware (chip variant, FW version) are you
> > > > > seeing problems on?
> > > >
> > > > Pin-yen: I'm assuming you'll provide this.
> >
> > From the debugfs entry:
> >
> > driver_name = "mwifiex"
> > driver_version = mwifiex 1.0 (15.68.19.p54)
> > verext = w8897o-B0, RF87XX, FP68, 15.68.19.p54
> >
> > The compatible string of the DT is "marvell,sd8897".
> > >
> > > I'll leave it up to y'all (Doug and Pin-Yen) whether you want to provide
> > > the above to provide a little more confidence, or if you want to
> > > reconsider your use of CONFIG_DPM_WATCHDOG_TIMEOUT.
> >
> > It looks okay to me to decrease the timeout here given that scanning
> > doesn't seem to affect the suspend time. What's your thought, Doug?
>
> I think Brian is right and that we should change how we're using the
> DPM watchdog, but IMO that doesn't mean that we couldn't also change
> this timeout.
>
> If nothing else, you'd want to post a v2 of your patch containing a
> summary of the info you've gathered so it's recorded with the patch.
>
> Maybe what makes the most sense, though, is to put our money where our
> mouth is and land some sort of patch in the ChromeOS tree and then
> report back to upstream in a month when we have data about it. If
> things look good in the field then presumably that would give some
> confidence for upstream to be willing to land the patch?
>
> Probably about the best data we could gather would be to break the
> existing timeout into two halves. You could wait half the time, then
> print a warning, and then wait the other half the time. We could even
> land that change _without_ changing the timeout to 5 seconds. Then if
> we ever see the warning print but then the suspend succeeds we know
> that there are cases where waiting longer would have helped. If we
> made that a WARN_ON() our existing infrastructure would help us gather
> that info...
I just realized that mwifiex_wait_queue_complete() has another 12s
timeout[1], though they are not directly involved in suspend/resume.
Should we also add a warning to that and see if we can make it half?
This starts to make me think how many timeouts we want to consider.
Does it make sense to only focus on the one in mwifiex_enable_hs()? Or
should we check other timeouts in mwifiex or even other drivers?
[1]: https://elixir.bootlin.com/linux/v6.12.1/source/drivers/net/wireless/marvell/mwifiex/sta_ioctl.c#L51
>
> ...so I guess summarizing my rambling, a good plan would be:
>
> 1. Change the ChromeOS DPM watchdog timeout to 15 seconds to avoid the
> problem for now.
>
> 2. Land a patch to do a WARN_ON() in mwifiex if we take 5 seconds.
> Actually, you could (probably) send this upstream and land it
> FROMLIST?
>
> 3. Wait ~1-2 months and see if we ever see the WARN_ON() trigger but
> then things succeed after 10 seconds. If this never happens then send
> a v2 patch changing the timeout to 5 seconds with details in the
> commit message.
>
> 4. In the background, see if we can convince folks to make the DPM
> Watchdog less panicky.
>
>
> -Doug
Regards,
Pin-yen