Re: [PATCH v3 0/5] PM: sleep: Improvements of async suspend and resume of devices

From: Rafael J. Wysocki
Date: Sat Mar 15 2025 - 10:58:02 EST


On Fri, Mar 14, 2025 at 10:06 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> On Fri, Mar 14, 2025 at 6:24 AM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> >
> > Hi Everyone,
> >
> > This is a new iteration of the async suspend/resume improvements work:
> >
> > https://lore.kernel.org/linux-pm/1915694.tdWV9SEqCh@xxxxxxxxxxxxx/
> >
> > which includes some rework and fixes of the patches in the series linked
> > above. The most significant differences are splitting the second patch
> > into two patches and adding a change to treat consumers like children
> > during resume.
> >
> > This new iteration is based on linux-pm.git/linux-next and on the recent
> > fix related to direct-complete:
> >
> > https://lore.kernel.org/linux-pm/12627587.O9o76ZdvQC@xxxxxxxxxxxxx/
> >
> > The overall idea is still to start async processing for devices that have
> > at least some dependencies met, but not necessarily all of them, to avoid
> > overhead related to queuing too many async work items that will have to
> > wait for the processing of other devices before they can make progress.
> >
> > Patch [1/5] does this in all resume phases, but it just takes children
> > into account (that is, async processing is started upfront for devices
> > without parents and then, after resuming each device, it is started for
> > the device's children).
> >
> > Patch [2/5] does this in the suspend phase of system suspend and only
> > takes parents into account (that is, async processing is started upfront
> > for devices without any children and then, after suspending each device,
> > it is started for the device's parent).
> >
> > Patch [3/5] extends it to the "late" and "noirq" suspend phases.
> >
> > Patch [4/5] adds changes to treat suppliers like parents during suspend.
> > That is, async processing is started upfront for devices without any
> > children or consumers and then, after suspending each device, it is
> > started for the device's parent and suppliers.
> >
> > Patch [5/5] adds changes to treat consumers like children during resume.
> > That is, async processing is started upfront for devices without a parent
> > or any suppliers and then, after resuming each device, it is started for
> > the device's children and consumers.
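
For illustration only, the suspend-side ordering described for patches
[2/5] and [4/5] can be modeled in userspace roughly as below. This is a
toy Python sketch, not the kernel code: the function name
suspend_order(), the dict-based device graph and the single queue
standing in for async work items are all made up for the example, and
it assumes that the dependency graph is acyclic.

```python
from collections import deque


def suspend_order(parent, suppliers):
    """Return one valid suspend order for a toy device graph.

    parent:    dict mapping each device to its parent (or None)
    suppliers: dict mapping a device to a list of its suppliers
    Both structures are hypothetical stand-ins for the device hierarchy
    and device links; the graph is assumed to be acyclic.
    """
    devices = list(parent)

    # Devices that must be suspended before a given device:
    # its children and its consumers.
    dependents = {dev: set() for dev in devices}
    for dev, par in parent.items():
        if par is not None:
            dependents[par].add(dev)
    for consumer, sups in suppliers.items():
        for sup in sups:
            dependents[sup].add(consumer)

    # Upfront, start processing only for devices without any
    # children or consumers.
    upfront = [dev for dev in devices if not dependents[dev]]
    queue = deque(upfront)
    started = set(upfront)
    order = []
    done = set()

    while queue:
        dev = queue.popleft()
        if not dependents[dev] <= done:
            # Started because some dependents are done, but not all
            # of them yet: keep waiting.
            queue.append(dev)
            continue
        order.append(dev)
        done.add(dev)
        # After suspending a device, start its parent and suppliers.
        upstream = list(suppliers.get(dev, []))
        if parent[dev] is not None:
            upstream.insert(0, parent[dev])
        for up in upstream:
            if up not in started:
                started.add(up)
                queue.append(up)
    return order
```

With a made-up graph in which "panel" consumes "gpu" and "gpu" consumes
"clk", only "panel" is started upfront and everything else is started
as its children or consumers finish. The resume side (patches [1/5]
and [5/5]) would be the mirror image, starting from devices without a
parent or suppliers.
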
> >
> > Preliminary test results from one sample system are below.
> >
> > "Baseline" is the linux-pm.git/testing branch, "Parent/child"
> > is that branch with patches [1-3/5] applied and "Device links"
> > is that branch with patches [1-5/5] applied.
> >
> > "s/r" means "regular" suspend/resume, noRPM is "late" suspend
> > and "early" resume, and noIRQ means the "noirq" phases of
> > suspend and resume, respectively. The numbers are suspend
> > and resume times for each phase, in milliseconds.
> >
> >         Baseline          Parent/child      Device links
> >
> >         Suspend   Resume  Suspend   Resume  Suspend   Resume
> >
> > s/r         427      449      298      450      294      442
> > noRPM        13        1       13        1       13        1
> > noIRQ        31       25       28       24       28       26
> >
> > s/r         408      442      298      443      301      447
> > noRPM        13        1       13        1       13        1
> > noIRQ        32       25       30       25       28       25
> >
> > s/r         408      444      310      450      298      439
> > noRPM        13        1       13        1       13        1
> > noIRQ        31       24       31       26       31       24
> >
> > It clearly shows an improvement in the suspend path after
> > applying patches [1-3/5], easily attributable to patch [2/5],
> > and clear difference after updating the async processing of
> > suppliers and consumers.

A "no" is missing above, it should be "and no clear difference after
updating ...".

Also, please find attached a text file with sample results from 3
different systems (including the one above), not for drawing any
conclusions (the number of samples is too low), but to illustrate what
can happen.

While both Dell XPS13 systems show a consistent improvement after
applying the first three patches, everything else is essentially a
wash (particularly on the desktop machine that seems to suspend and
resume as fast as it gets already).
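
To put rough numbers on the "s/r" suspend rows of the attached file,
the sketch below averages the three samples per machine for the
baseline and for the kernel with patches [1-3/5] applied. It is plain
Python with made-up names (samples, suspend_reduction()), the values
are copied from the attachment, and three samples per machine are far
too few to read the result as anything more than an illustration.

```python
from statistics import mean

# "s/r" suspend times in milliseconds, three samples per machine,
# copied from the attached results file.
samples = {
    "Dell XPS13 9360": {
        "baseline": [427, 408, 408],
        "parents_children": [298, 298, 310],
    },
    "Dell XPS13 9380": {
        "baseline": [439, 439, 440],
        "parents_children": [318, 318, 319],
    },
    "Coffee Lake Desktop": {
        "baseline": [138, 133, 124],
        "parents_children": [130, 124, 126],
    },
}


def suspend_reduction(baseline, patched):
    """Return (absolute reduction in ms, relative reduction in percent)
    of the mean suspend time, patched versus baseline."""
    base = mean(baseline)
    new = mean(patched)
    return base - new, 100.0 * (base - new) / base


for machine, data in samples.items():
    ms, pct = suspend_reduction(data["baseline"], data["parents_children"])
    print(f"{machine}: {ms:.1f} ms ({pct:.1f}%)")
```

Means are used here only because they are the simplest summary; with
more runs, medians or full distributions would be a better basis for
comparison.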

> >
> > Note that there are systems where resume times are shorter after
> > patches [1-3/5] too, but more testing is necessary.
> >
> > I do realize that this code can be optimized further, but it is not
> > particularly clear to me that any further optimizations would make
> > a significant difference and the changes in this series are deep
> > enough to do in one go.
>
> Thanks for adding patches 4 and 5!

No problem.

> Let me try to test them early next week and compare your patches 1-3,
> 1-5 and my series (which does additional checks to make sure
> suppliers/consumers are done). I do about 100 suspend/resume runs for
> each kernel, so please bear with me while I get it.

Thanks and no worries, please take as much time as needed. I will be
traveling next week, so I'll be a bit slow to respond anyway.

Since I've got a confirmation from internal testing (carried out on a
much wider range of machines and much more extensively than I can do
it myself) that patches [1-3/5] are an overall improvement, I'm planning
to queue them up during the 6.16 cycle and other improvements can be
done on top of them, including patches [4-5/5]. I also think that
adding explicit status tracking on top of patches [4-5/5] (if it turns
out to make things measurably faster than this series) would be rather
straightforward.

        Baseline          Parents/children  Device links

        Suspend   Resume  Suspend   Resume  Suspend   Resume

Dell XPS13 9360

s/r         427      449      298      450      294      442
noRPM        13        1       13        1       13        1
noIRQ        31       25       28       24       28       26

s/r         408      442      298      443      301      447
noRPM        13        1       13        1       13        1
noIRQ        32       25       30       25       28       25

s/r         408      444      310      450      298      439
noRPM        13        1       13        1       13        1
noIRQ        31       24       31       26       31       24

Dell XPS13 9380

s/r         439      283      318      290      319      290
noRPM        15        2       15        1       15        2
noIRQ       198     1766      202     1743      204     1766

s/r         439      281      318      280      320      280
noRPM        15        2       15        1       15        1
noIRQ       199     1781      203     1783      205     1770

s/r         440      279      319      281      320      283
noRPM        14        2       15        1       15        1
noIRQ       197     1777      202     1765      203     1724

Coffee Lake Desktop

s/r         138      347      130      345      132      344
noRPM        15        2       20        2       15        2
noIRQ        15       25       23       25       16       26

s/r         133      345      124      343      131      346
noRPM        14        1       13        1       13        1
noIRQ        15       25       14       25       14       25

s/r         124      343      126      345      128      345
noRPM        13        1       13        1       13        1
noIRQ        14       25       14       25       14       26