Re: [PATCH] 4.4.86-rt99: fix sync breakage between nr_cpus_allowed and cpus_allowed

From: Steven Rostedt
Date: Mon Nov 20 2017 - 23:02:17 EST


On Mon, 20 Nov 2017 11:30:40 -0500
joe.korty@xxxxxxxxxxxxxxxxx wrote:

> Hi Steve,
> A quick perusal of 4.11.12-rt16 shows that it has an
> entirely new version of migrate_disable which to me appears
> correct.
>
> In that new implementation, migrate_enable() recalculates
> p->nr_cpus_allowed when it switches the task back to
> using p->cpus_mask. This brings the two back into sync
> if anything had happened to get them out of sync while
> migration was disabled (as would happen on an affinity
> change during that disable period).
>
> 4.9.47-rt37 has the old implementation and it appears to
> have the same bug as 4.4-rt, though I have yet to test 4.9-rt.
>
> The fix in these older versions could take one of two
> forms: either we recalculate p->nr_cpus_allowed when
> migrate_enable() goes back to using p->cpus_allowed,
> as the 4.11-rt version does, or we fix the one place
> where p->nr_cpus_allowed is allowed to diverge from
> p->cpus_allowed. The patch I submitted earlier takes
> this second approach.
>
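
For readers following along, the two forms of the fix described above
can be illustrated with a toy user-space model. This is only a sketch
under stated assumptions, not kernel code: struct toy_task, the toy_*
helpers, the 64-bit mask, and the keep_in_sync and recalc flags are all
hypothetical names invented for illustration, and the "bug" branch is a
simplified stand-in for the one place where the count is allowed to
diverge from the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: all names here are hypothetical, not the kernel's. */
struct toy_task {
	uint64_t cpus_allowed;   /* stand-in for p->cpus_allowed */
	int nr_cpus_allowed;     /* cached weight of the mask above */
	int migrate_disabled;    /* nonzero while migration is disabled */
};

/* Count the set bits in the mask (Kernighan's method). */
static int toy_weight(uint64_t mask)
{
	int n = 0;

	while (mask) {
		mask &= mask - 1;
		n++;
	}
	return n;
}

/*
 * Affinity change. With keep_in_sync == 0 this models the divergence:
 * while migration is disabled the mask is updated but the count is not.
 * With keep_in_sync != 0 the count is always updated (second approach).
 */
static void toy_set_affinity(struct toy_task *p, uint64_t new_mask,
			     int keep_in_sync)
{
	p->cpus_allowed = new_mask;
	if (p->migrate_disabled && !keep_in_sync)
		return;
	p->nr_cpus_allowed = toy_weight(new_mask);
}

static void toy_migrate_disable(struct toy_task *p)
{
	p->migrate_disabled = 1;
}

/* With recalc != 0 the count is recomputed on enable (first approach). */
static void toy_migrate_enable(struct toy_task *p, int recalc)
{
	p->migrate_disabled = 0;
	if (recalc)
		p->nr_cpus_allowed = toy_weight(p->cpus_allowed);
}

/* Returns 1 if mask and count agree after an affinity change made
 * while migration was disabled, 0 if they have diverged. */
static int toy_run(int keep_in_sync, int recalc)
{
	struct toy_task t = {
		.cpus_allowed = 0xF,
		.nr_cpus_allowed = 4,
		.migrate_disabled = 0,
	};

	toy_migrate_disable(&t);
	toy_set_affinity(&t, 0x3, keep_in_sync);
	toy_migrate_enable(&t, recalc);
	return t.nr_cpus_allowed == toy_weight(t.cpus_allowed);
}
```

In this model, toy_run(0, 0) reproduces the divergence, toy_run(0, 1)
corresponds to recalculating in migrate_enable(), and toy_run(1, 0)
corresponds to fixing the update site. Either flag alone is enough to
bring the two fields back into agreement.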

Ideally, I would like to stay close to what upstream -rt does. Would
you be able to backport the 4.11-rt patch?

I'm currently working on releasing 4.9-rt and 4.4-rt with the latest
backports. I could easily add this one too.

-- Steve