Re: [PATCH 3/3] context_tracking,x86: remove extraneous irq disable & enable from context tracking on syscall entry

From: Ingo Molnar
Date: Thu May 07 2015 - 06:35:40 EST



* Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> >> [...] Also, we'd have to audit all the entries, and
> >> system_call_after_swapgs currently enables interrupts early
> >> enough that an interrupt before all the pushes will do
> >> unpredictable things to pt_regs.
> >
> > An irq hardware frame won't push zero to that selector value,
> > right? That's the only bad thing that would confuse the code.
> >
>
> I think it's not quite that simple. The syscall entry looks like,
> roughly:
>
> fix rsp;
> sti;
> push ss;
> push rsp;
> push flags;
> push cs;
> push rip;
>
> We can get an interrupt part-way through those pushes. Maybe there's
> no bad place where we could get an IRQ since SS is first, [...]

... and it should be covered by the 'STI window' (the interrupt
shadow), where the instruction following an STI still executes as
part of the irqs-off section.

> [...] but this is still nasty.

True.

Another approach would be to set up two aliases in the GDT, so we
could freely change 'ss' between them and thus store information,
without possibly confusing the syscall entry/exit code.

That still gets complicated by IST entries, which create multiple
positions for the 'flag'.
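
To make the alias idea concrete, here is a minimal user-space sketch,
not kernel code. It assumes two GDT slots hold identical descriptors,
so that which selector is loaded into 'ss' carries one bit of
information. The selector values and the helper names
(SS_ALIAS_CLEAR, SS_ALIAS_SET, ss_encode_flag(), ss_decode_flag()) are
made up for illustration and do not reflect the real GDT layout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical selectors for two GDT entries that are assumed to
 * describe the same segment. The numbers are illustrative only.
 */
#define SS_ALIAS_CLEAR ((uint16_t)0x18)
#define SS_ALIAS_SET   ((uint16_t)0x28)

/* Pick the alias that encodes the flag value. */
uint16_t ss_encode_flag(bool flag)
{
	return flag ? SS_ALIAS_SET : SS_ALIAS_CLEAR;
}

/* True if the saved ss value is one of the two aliases. */
bool ss_is_alias(uint16_t ss)
{
	return ss == SS_ALIAS_CLEAR || ss == SS_ALIAS_SET;
}

/* Recover the flag from a saved ss value; ss must be one of the aliases. */
bool ss_decode_flag(uint16_t ss)
{
	assert(ss_is_alias(ss));
	return ss == SS_ALIAS_SET;
}
```

Since both values are valid selectors, code that only checks for a
zero selector would not be confused by either one. The sketch does not
model the IST complication: each saved frame has its own 'ss' slot, so
the code reading the flag would still need to know which frame to
look at.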

> I think I prefer a per-cpu approach over a per-task approach, since
> it's easier to reason about and it should still only require one
> instruction on entry and one instruction on exit.

Yes, agreed.
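
For reference, the per-cpu variant could look roughly like the sketch
below. It is a user-space model only: the array stands in for a real
per-cpu variable, and all names (NR_CPUS_SKETCH, in_kernel_flag,
sketch_syscall_entry() and friends) are hypothetical. On x86-64 a
per-cpu variable is addressed relative to %gs, which is why each
update could plausibly be a single store instruction:

```c
#include <stdbool.h>

/* Hypothetical CPU count for the model; not a kernel constant. */
#define NR_CPUS_SKETCH 4

/* Stand-in for a per-cpu variable: one flag per CPU. */
bool in_kernel_flag[NR_CPUS_SKETCH];

/* Entry path: a single store marks this CPU as being in the kernel. */
void sketch_syscall_entry(int cpu)
{
	in_kernel_flag[cpu] = true;
}

/* Exit path: a single store marks this CPU as back in user mode. */
void sketch_syscall_exit(int cpu)
{
	in_kernel_flag[cpu] = false;
}

/* Query the state of a given CPU. */
bool sketch_cpu_in_kernel(int cpu)
{
	return in_kernel_flag[cpu];
}
```

The state lives in one well-known place per CPU instead of in saved
register frames, which is what makes it easier to reason about than
the selector trick.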

Thanks,

Ingo