Re: [PATCH v3 5/5] cpusets, suspend: Save and restore cpusets during suspend/resume

From: Srivatsa S. Bhat
Date: Wed May 16 2012 - 04:43:08 EST


On 05/16/2012 01:50 PM, Srivatsa S. Bhat wrote:

> On 05/16/2012 04:02 AM, David Rientjes wrote:
>
>> On Wed, 16 May 2012, Srivatsa S. Bhat wrote:
>>
>>>> I know root is special
>>>> cased all over the cpuset code, but I think the real fix here is to figure
>>>> out why it can't be left as a superset and then we end up doing nothing
>>>> for s/r.
>>>>
>>>> I don't have a preference for cpu hotplug and whether cpuset.cpus = 1-3
>>>> remains 1-3 when cpu 2 is offlined or not, I think it could be argued both
>>>> ways, but I disagree with saving the cpumask, removing all suspended cpus,
>>>> and then reinstating it for no reason.
>>>>
>>>
>>> I think there is a valid reason behind doing that.
>>>
>>> Cpusets translates to sched domains in scheduler terms. So whenever you update
>>> cpusets, the sched domains are updated. IOW, if you don't touch cpusets during
>>> hotplug (suspend/resume case), you allow them to have offline cpus, meaning,
>>> you allow sched domains to have offline cpus. Hence sched domains are rendered
>>> stale.
>>>
>>
>> It's not possible to update the sched domains for s/r to be a subset of
>> cpuset.cpus?
>
>
> (Btw, the above statement reminds me of a different idea I had long back
> which I will write about in a separate mail.)
>


You suggested keeping sched domains updated during s/r without altering cpuset.cpus.
That is a very good point!

Because then we would be distinguishing between two things. Sched domains can be
"stale" for two distinct reasons, one of which is troublesome while the other is
harmless:

1. offline cpus are included in some sched domains, and some offline cpus have
a non-NULL sched domain pointer. This is the problematic situation.

2. sched domains don't reflect the cpuset configurations set up in cpuset.cpus of
different cpusets. This is not really harmful, because if this happens only during
s/r, the userspace wouldn't really notice it, as long as we reinstate the
cpuset<->sched domain dependency properly at the end of resume.

So you are suggesting that we tolerate the situation in #2: keep the sched domains
updated (partially, at least enough that they are not harmful), so that we avoid the
problem in #1.
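
To make the distinction concrete, here is a rough userspace sketch of the two kinds
of staleness. It is only a toy model and not kernel code: the one-bit-per-CPU
toy_mask_t type and both helper names are made up for illustration, and it assumes
each cpuset maps to exactly one sched domain, in the same order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy type: one bit per CPU (not the kernel's struct cpumask). */
typedef uint32_t toy_mask_t;

/* Case 1 (harmful): some sched domain still spans an offline CPU. */
static bool domains_span_offline(const toy_mask_t *doms, int ndoms,
                                 toy_mask_t online)
{
    for (int i = 0; i < ndoms; i++)
        if (doms[i] & ~online)
            return true;
    return false;
}

/*
 * Case 2 (harmless while userspace is frozen): the domains do not match
 * what cpuset.cpus asks for. Assumes one domain per cpuset, same order.
 */
static bool domains_differ_from_cpusets(const toy_mask_t *doms, int ndoms,
                                        const toy_mask_t *cpusets, int nsets)
{
    if (ndoms != nsets)
        return true;
    for (int i = 0; i < ndoms; i++)
        if (doms[i] != cpusets[i])
            return true;
    return false;
}
```

With only CPU 0 online, domains {0-1, 2-3} hit case 1, whereas a single domain {0}
only hits case 2.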

I had written a patch for this long ago:
http://thread.gmane.org/gmane.linux.kernel/1250097/focus=1254715

The idea there was to create a single sched domain at the beginning of suspend,
temporarily ignoring cpuset configurations, and reinstating the proper sched domain
tree taking cpusets into consideration, at the end of resume. That way we need not
touch cpusets during s/r, we need not explicitly save/restore cpusets, and we still
manage to keep the scheduler sane and happy. And the frozen userspace cannot observe
the temporary mismatch between cpusets<->sched domains. So no problems there either.
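
The flow described above could be modelled roughly like this. Again, this is a toy
userspace sketch and not the patch from the link: struct toy_sched and all the
toy_* functions are hypothetical names, CPUs are single bits in a uint32_t, and each
cpuset is assumed to map to exactly one sched domain.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_SETS 8

/* Hypothetical toy state; one bit per CPU in every mask. */
struct toy_sched {
    uint32_t cpuset_cpus[TOY_MAX_SETS]; /* user configuration, never modified */
    int nsets;
    uint32_t doms[TOY_MAX_SETS];        /* current sched domains */
    int ndoms;
    uint32_t online;                    /* currently online CPUs */
};

/* Beginning of suspend: ignore cpusets, build one domain over online CPUs. */
static void toy_suspend_begin(struct toy_sched *s)
{
    s->doms[0] = s->online;
    s->ndoms = 1;
}

/* Hotplug during s/r: only the single domain is updated, cpusets untouched. */
static void toy_cpu_offline(struct toy_sched *s, int cpu)
{
    assert(s->ndoms == 1);
    s->online &= ~(1u << cpu);
    s->doms[0] = s->online; /* no offline CPU is left in any domain */
}

static void toy_cpu_online(struct toy_sched *s, int cpu)
{
    assert(s->ndoms == 1);
    s->online |= 1u << cpu;
    s->doms[0] = s->online;
}

/* End of resume: rebuild the domains from the untouched cpuset config. */
static void toy_resume_end(struct toy_sched *s)
{
    for (int i = 0; i < s->nsets; i++)
        s->doms[i] = s->cpuset_cpus[i] & s->online;
    s->ndoms = s->nsets;
}
```

Note that cpuset_cpus is never written, so there is nothing to save or restore; the
mismatch exists only between toy_suspend_begin() and toy_resume_end().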

IMHO, the only reason we didn't settle on that patch earlier was that the version
in commit 8f2f748b06562 looked much simpler (and at that point, we had no clue that
the latter would lead to suspend hangs).

So, now, we can either go with the design in this v3 (explicit save/restore) or the
one in the link shown above (temporary cpuset<->sched domain mismatch during s/r).

Not sure what Peter has to say about the latter though... He might have reservations
about it, I don't know ;-)

Regards,
Srivatsa S. Bhat
