Re: [patch V3 01/13] entry: Provide generic syscall entry functionality

From: Andy Lutomirski
Date: Sat Jul 18 2020 - 10:41:25 EST


On Sat, Jul 18, 2020 at 7:16 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Andy Lutomirski <luto@xxxxxxxxxx> writes:
> > On Fri, Jul 17, 2020 at 12:29 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> The alternative is to play nasty games with TIF_IA32, TIF_ADDR32 and
> >> TIF_X32 to free up bits for 32bit and make the flags field 64 bit on 64
> >> bit kernels, but I prefer to do the above separation.
> >
> > I'm all for cleaning it up, but I don't think any nasty games would be
> > needed regardless. IMO at least the following flags are nonsense and
> > don't belong in TIF_anything at all:
> >
> > TIF_IA32, TIF_X32: can probably be deleted. Someone would just need
> > to finish the work.
> > TIF_ADDR32: also probably removable, but I'm less confident.
> > TIF_FORCED_TF: This is purely a ptrace artifact and could easily go
> > somewhere else entirely.
> >
> > So getting those five bits back would be straightforward.
> >
> > FWIW, TIF_USER_RETURN_NOTIFY is a bit of an odd duck: it's an
> > entry/exit word *and* a context switch word. The latter is because
> > it's logically a per-cpu flag, not a per-task flag, and the context
> > switch code moves it around so it's always set on the running task.
>
> Gah, I missed the context switch thing of that. That stuff is hideous.

It's also delightful because anything that screws up that dance (such
as failure to do the exit-to-usermode path exactly right) likely
results in an insta-root-hole. If we fail to run user return
notifiers, we can run user code with incorrect syscall MSRs, etc.
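
To make the dance concrete, here is a toy userspace model of it. This is
only a sketch under stated assumptions, not the kernel's code: struct task,
struct cpu, load_guest_msr(), context_switch(), context_switch_broken() and
exit_to_user() are all hypothetical names, and a single bool stands in for
TIF_USER_RETURN_NOTIFY.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a kernel symbol. */
struct task {
	bool user_return_notify;	/* stands in for TIF_USER_RETURN_NOTIFY */
};

struct cpu {
	struct task *current;		/* task running on this CPU */
	unsigned long syscall_msr;	/* value currently loaded on the CPU */
	unsigned long host_syscall_msr;	/* value user code must run with */
};

/*
 * Something loads a non-host MSR value and requests a restore before the
 * next return to user mode. The request is per-CPU state, but it is
 * recorded on whichever task is currently running.
 */
static void load_guest_msr(struct cpu *cpu, unsigned long val)
{
	cpu->syscall_msr = val;
	cpu->current->user_return_notify = true;
}

/* Correct context switch: the flag follows the CPU, not the task. */
static void context_switch(struct cpu *cpu, struct task *next)
{
	struct task *prev = cpu->current;

	/* Invariant of the model: only the running task carries the flag. */
	assert(!next->user_return_notify);

	if (prev->user_return_notify) {
		prev->user_return_notify = false;
		next->user_return_notify = true;
	}
	cpu->current = next;
}

/* Broken context switch: forgets to move the flag to the incoming task. */
static void context_switch_broken(struct cpu *cpu, struct task *next)
{
	cpu->current = next;
}

/*
 * Exit to user mode: run the "notifier" if the running task has the flag.
 * Returns true if user code would run with the correct MSR value.
 */
static bool exit_to_user(struct cpu *cpu)
{
	if (cpu->current->user_return_notify) {
		cpu->syscall_msr = cpu->host_syscall_msr;
		cpu->current->user_return_notify = false;
	}
	return cpu->syscall_msr == cpu->host_syscall_msr;
}
```

With context_switch(), the incoming task inherits the flag and
exit_to_user() restores the host value. With context_switch_broken(), the
flag is stranded on a task that is no longer running, exit_to_user()
returns false, and in the real thing that would mean user code running
with the wrong syscall MSRs.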