Re: [patch V4 4/8] sched: Make migrate_disable/enable() independent of RT

From: Peter Zijlstra
Date: Fri Nov 20 2020 - 04:29:52 EST

Next message: Alex Shi: "[PATCH next-akpm] mm/memcg: add missed warning in mem_cgroup_lruvec"
Previous message: David Hildenbrand: "Re: [PATCH] mm,hugetlb: Remove unneded initialization"
In reply to: Thomas Gleixner: "Re: [patch V4 4/8] sched: Make migrate_disable/enable() independent of RT"
Next in thread: Andy Lutomirski: "Re: [patch V4 4/8] sched: Make migrate_disable/enable() independent of RT"

On Fri, Nov 20, 2020 at 02:33:58AM +0100, Thomas Gleixner wrote:
> On Thu, Nov 19 2020 at 19:28, Peter Zijlstra wrote:
> > On Thu, Nov 19, 2020 at 09:23:47AM -0800, Linus Torvalds wrote:
> >> Because this is certainly not the only time migration limiting has
> >> come up, and no, it has absolutely nothing to do with per-cpu page
> >> tables being completely unacceptable.
> >
> > It is for this instance; but sure, it's come up before in other
> > contexts.
>
> Indeed. And one of the really bad outcomes of this is that people are
> forced to use preempt_disable() to prevent migration which entails a
> slew of consequences:
>
> - Using spinlocks where it wouldn't be needed otherwise
> - Spinwaiting instead of sleeping
> - The whole craziness of doing copy_to/from_user_in_atomic() along
>   with the necessary out-of-line error handling.
> - ....
>
> The introduction of per-cpu storage happened almost 20 years ago (2002)
> and still the only answer we have is preempt_disable().

IIRC the first time this migrate_disable() stuff came up was when Chris
Lameter did SLUB. Eventually he settled for that cmpxchg_double()
approach (which is somewhat similar to userspace rseq) which is vastly
superior and wouldn't have happened had we provided migrate_disable().
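
To illustrate the idea behind the cmpxchg_double() approach mentioned
above, here is a minimal userspace sketch, not SLUB's actual code. As far
as I recall, SLUB pairs a per-cpu freelist pointer with a transaction id
and commits both with one double-word compare-and-exchange, so an update
that raced with preemption or migration simply fails and is retried. The
sketch below assumes a simplification: a 32-bit slot index and a 32-bit
tid packed into one 64-bit word, so a plain C11 compare-and-exchange is
enough. All names (struct freelist, freelist_pop(), NSLOTS, ...) are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NSLOTS 8u
#define NIL    0xffffffffu      /* hypothetical "list is empty" marker */

/* Hypothetical freelist: next[i] links free slot i to the next free slot. */
struct freelist {
    _Atomic uint64_t head_tid;  /* low 32 bits: head slot, high 32 bits: tid */
    _Atomic uint32_t next[NSLOTS];
};

uint64_t pack(uint32_t head, uint32_t tid)
{
    return ((uint64_t)tid << 32) | head;
}

void freelist_init(struct freelist *fl)
{
    for (uint32_t i = 0; i < NSLOTS; i++)
        atomic_store(&fl->next[i], (i + 1 < NSLOTS) ? i + 1 : NIL);
    atomic_store(&fl->head_tid, pack(0, 0));
}

/* Pop one free slot; returns NIL when the list is empty. */
uint32_t freelist_pop(struct freelist *fl)
{
    uint64_t old = atomic_load(&fl->head_tid);

    for (;;) {
        uint32_t head = (uint32_t)old;
        uint32_t tid = (uint32_t)(old >> 32);

        if (head == NIL)
            return NIL;

        /* This read may be stale; the tid check in the CAS catches that. */
        uint32_t next = atomic_load_explicit(&fl->next[head],
                                             memory_order_relaxed);
        uint64_t want = pack(next, tid + 1);

        /* Commit head and tid together; on failure 'old' is reloaded. */
        if (atomic_compare_exchange_weak(&fl->head_tid, &old, want))
            return head;
    }
}

/* Return a slot (must be < NSLOTS) to the list. */
void freelist_push(struct freelist *fl, uint32_t slot)
{
    uint64_t old = atomic_load(&fl->head_tid);

    assert(slot < NSLOTS);
    for (;;) {
        atomic_store_explicit(&fl->next[slot], (uint32_t)old,
                              memory_order_relaxed);
        uint64_t want = pack(slot, (uint32_t)(old >> 32) + 1);

        if (atomic_compare_exchange_weak(&fl->head_tid, &old, want))
            return;
    }
}
```

The point of the tid is that every successful update bumps it, so a
compare-and-exchange prepared against an older state fails even if the
head slot happens to be the same again. Nothing here needs preemption or
migration disabled; the fast path is optimistic and the retry is the
whole error handling.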

As already stated, per-cpu page-tables would allow for a much saner kmap
approach, but alas, x86 really can't sanely do that (the archs that have
separate kernel and user page-tables could do this, and how we cursed
that x86 didn't have that when Meltdown happened).

[ and using fixmaps in the per-cpu memory space _could_ work, but is a
giant pain because then all accesses need GS prefix and blah... ]

And I'm sure there are creative ways for other problems too, but yes, it's
hard.

Anyway, clearly I'm the only one that cares, so I'll just crawl back
under my rock...
