On 28/07/22 11:39, Tejun Heo wrote:
> Hello, Waiman.
>
> On Thu, Jul 28, 2022 at 05:04:19PM -0400, Waiman Long wrote:
> > > So, the patch you proposed is making the code remember one special aspect of
> > > user requested configuration - whether it configured it or not, and trying
> > > to preserve that particular state as cpuset state changes. It addresses the
> > > immediate problem but it is a very partial approach. Let's say a task wanna
> > > be affined to one logical thread of each core and set its mask to 0x5555.
> > > Now, let's say cpuset got enabled and enforced 0xff and affined the task to
> > > 0xff. After a while, the cgroup got more cpus allocated and its cpuset now
> > > has 0xfff. Ideally, what should happen is the task now having the effective
> > > mask of 0x555. In practice, tho, it either would get 0xf55 or 0x55 depending
> > > on which way we decide to misbehave.
> >
> > OK, I see what you want to accomplish. To fully address this issue, we will
> > need to have a new cpumask variable in the task structure which will be
> > allocated if sched_setaffinity() is ever called. I can rework my patch to
> > use this approach.
>
> Yeah, we'd need to track what user requested separately from the currently
> effective cpumask. Let's make sure that the scheduler folks are on board
> before committing to the idea tho. Peter, Ingo, what do you guys think?

FWIW on a runtime overhead side of things I think it'll be OK as that
should be just an extra mask copy in sched_setaffinity() and a subset
check / cpumask_and() in set_cpus_allowed_ptr(). The policy side is a bit
less clear (when, if ever, do we clear the user-defined mask? Will it keep
haunting us even after moving a task to a disjoint cpuset partition?).
There's also if/how that new mask should be exposed, because attaching a
task to a cpuset will now yield a not-necessarily-obvious affinity -
e.g. in the thread affinity example above, if the initial affinity setting
was done ages ago by some system tool, IMO the user needs a way to be able
to expect/understand the result of 0x555 rather than 0xfff.
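
To make the 0x555 vs 0xfff expectation concrete, here is a toy userspace
sketch of the idea, not kernel code. It assumes plain 64-bit masks instead of
struct cpumask, and every name in it (toy_task, toy_setaffinity(),
toy_apply_cpuset()) is made up for illustration. The fallback when the
remembered mask and the cpuset are disjoint is also just an assumption, since
that is exactly the open policy question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a task: 64-bit masks instead of struct cpumask. */
struct toy_task {
	bool     has_user_mask;   /* set once toy_setaffinity() was called */
	uint64_t user_mask;       /* what the user asked for (hypothetical field) */
	uint64_t effective_mask;  /* what the task may actually run on */
};

/* Recompute the effective mask from the remembered request and the cpuset. */
static void toy_apply_cpuset(struct toy_task *t, uint64_t cpuset_mask)
{
	uint64_t wanted = t->has_user_mask ? (t->user_mask & cpuset_mask)
					   : cpuset_mask;

	/* Assumed policy: with no overlap, fall back to the whole cpuset. */
	t->effective_mask = wanted ? wanted : cpuset_mask;
}

/* The "extra mask copy": remember the request, then restrict it. */
static void toy_setaffinity(struct toy_task *t, uint64_t requested,
			    uint64_t cpuset_mask)
{
	t->has_user_mask = true;
	t->user_mask = requested;
	toy_apply_cpuset(t, cpuset_mask);
}

/* Replays the numbers from the example quoted above. */
static void toy_replay_thread_example(void)
{
	struct toy_task t = {0};

	toy_setaffinity(&t, 0x5555, 0xffff);  /* no cpuset restriction yet */
	assert(t.effective_mask == 0x5555);

	toy_apply_cpuset(&t, 0xff);           /* cpuset enabled with 0xff */
	assert(t.effective_mask == 0x55);

	toy_apply_cpuset(&t, 0xfff);          /* cpuset grows to 0xfff */
	assert(t.effective_mask == 0x555);    /* not 0xf55, not 0x55 */
}
```

Because the request is kept apart from the effective mask, growing the cpuset
re-derives 0x555 from the original 0x5555 instead of extending or freezing
whatever the previous intersection happened to be.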

While I'm saying this, I don't think anything exposes p->user_cpus_ptr, but
then again that one is for "special" hardware...

> Thanks.
> --
> tejun