Re: [PATCH v2] x86: Preserve iopl on fork and execve

From: Ingo Molnar
Date: Tue May 12 2015 - 02:40:43 EST

* Alex Henrie <alexhenrie24@xxxxxxxxx> wrote:

> Signed-off-by: Alex Henrie <alexhenrie24@xxxxxxxxx>
> Suggested-by: Doug Johnson <dougvj@xxxxxxxxxx>
> ---
> arch/x86/kernel/process_32.c | 2 +-
> arch/x86/kernel/process_64.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
> diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
> index 8ed2106..0ef7078 100644
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -205,7 +205,7 @@ start_thread(struct pt_regs *regs, unsigned long new_ip, unsigned long new_sp)
> regs->cs = __USER_CS;
> regs->ip = new_ip;
> regs->sp = new_sp;
> - regs->flags = X86_EFLAGS_IF;
> + regs->flags = X86_EFLAGS_IF | (X86_EFLAGS_IOPL & regs->flags);
> force_iret();
> }
> EXPORT_SYMBOL_GPL(start_thread);
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index ddfdbf7..e21eda2 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -238,7 +238,7 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip,
> regs->sp = new_sp;
> regs->cs = _cs;
> regs->ss = _ss;
> - regs->flags = X86_EFLAGS_IF;
> + regs->flags = X86_EFLAGS_IF | (X86_EFLAGS_IOPL & regs->flags);
> force_iret();
> }

Yeah, NAK.

So this patch could be an instant root hole on some setups: assume old
64-bit apps rely on fork/clone/execve effectively flushing these
capabilities, and we'd now leak powerful hardware access permissions
into child contexts that never had them before ...
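
To make the concern concrete, here is a minimal user-space sketch of
the two flag computations from the quoted hunks. It is not kernel
code: flags_current(), flags_proposed() and iopl_level() are
hypothetical helper names, and the two constants are written out from
the EFLAGS layout (IF is bit 9, IOPL is bits 12-13) rather than taken
from the kernel headers.

```c
#include <assert.h>	/* only needed for the checks in the note below */

/* EFLAGS bits, written out by hand for this sketch. */
#define X86_EFLAGS_IF	0x00000200UL	/* interrupt enable, bit 9 */
#define X86_EFLAGS_IOPL	0x00003000UL	/* I/O privilege level, bits 12-13 */

/* Current behaviour: the new image starts with IF only, so IOPL is 0. */
static unsigned long flags_current(unsigned long old_flags)
{
	(void)old_flags;	/* the old flags are discarded on purpose */
	return X86_EFLAGS_IF;
}

/* Proposed behaviour: carry the old IOPL bits into the new image. */
static unsigned long flags_proposed(unsigned long old_flags)
{
	return X86_EFLAGS_IF | (X86_EFLAGS_IOPL & old_flags);
}

/* Hypothetical helper: extract the IOPL field (0..3) from a flags word. */
static unsigned int iopl_level(unsigned long flags)
{
	return (unsigned int)((flags & X86_EFLAGS_IOPL) >> 12);
}
```

With an old flags word of 0x3246 (IOPL 3), iopl_level(flags_current())
is 0 while iopl_level(flags_proposed()) is 3: the new image keeps full
port access without ever having asked for it.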

I realize that this is a 2.5+ year old regression on 32-bit x86, and
that the prior inheritance of iopl/ioperm was broken accidentally on
32-bit kernels by:

6783eaa2e125 ("x86, um/x86: switch to generic sys_execve and kernel_execve")

My arguments in favor of doing nothing are:

- Nothing actually broke that people cared about in the last 2.5
years, thus this might be one of the (very very rare) cases where
preserving a breakage is the right thing to do.

- There's no reason to export this behavior to 64-bit x86, which
apparently never propagated the iopl/ioperm capabilities.

- Furthermore, even new 32-bit apps might have (accidentally) learned
the new ABI, and we'd now break _them_, possibly in subtle ways.

- Plus iopl() and ioperm() are among the most dangerous kernel APIs
we have and the accidental limiting of them, which we got away with
for 2.5+ years without it being reported, might just be what we want
to stick with. An aspect of an API is only an ABI if it's actually
used by applications.

- These syscalls are rarely used, and we could as well insist that
every new context should have the permissions to (re-)acquire them
and should actively seek them - instead of having them inherited by
shells via system(), etc. The best strategy with dangerous APIs is to
make it really, really explicit when they are used.
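
As a rough illustration of the "actively seek them" model in the last
point, a child can request the privilege itself instead of inheriting
it. This is only a sketch: acquire_iopl() and child_requests_iopl()
are hypothetical helpers, iopl() is the glibc wrapper from <sys/io.h>
on x86, and the non-x86 branch is a stub added so the example builds
elsewhere.

```c
#include <assert.h>	/* only needed for the check in the note below */
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#if defined(__i386__) || defined(__x86_64__)
#include <sys/io.h>	/* iopl() wrapper, x86 only */
#endif

/* Explicitly request an I/O privilege level in the calling context. */
static int acquire_iopl(int level)
{
#if defined(__i386__) || defined(__x86_64__)
	return iopl(level);	/* fails with EPERM without CAP_SYS_RAWIO */
#else
	(void)level;
	errno = ENOSYS;		/* stub: no iopl() outside x86 */
	return -1;
#endif
}

/*
 * Fork a child that asks for the privilege itself.
 * Returns 0 if the child got it, 1 if the kernel refused,
 * -1 if fork/wait failed or the child died abnormally.
 */
static int child_requests_iopl(int level)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(acquire_iopl(level) == 0 ? 0 : 1);
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

An unprivileged caller would normally see child_requests_iopl(3)
return 1, which is the point: the request is visible and can be
refused, rather than the privilege silently riding along into
whatever the parent happens to spawn.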

Permission propagation breakages like this are a rare situation and
there's really no good way to fix them: damned if you do, damned if
you don't.

So without far more analysis and far more care (a zero-length
changelog won't cut it!) I doubt we can - or even want to - do
anything like this...

