Re: [PATCH 3/3] x86/efi: Use efi_switch_mm() rather than manually twiddling with cr3

From: Mark Rutland
Date: Wed Aug 16 2017 - 05:54:56 EST


On Wed, Aug 16, 2017 at 10:31:12AM +0100, Ard Biesheuvel wrote:
> (+ Mark, Will)
>
> On 15 August 2017 at 22:46, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > On Tue, Aug 15, 2017 at 12:18 PM, Sai Praneeth Prakhya
> > <sai.praneeth.prakhya@xxxxxxxxx> wrote:
> >> +/*
> >> + * Makes the calling kernel thread switch to/from efi_mm context
> >> + * Can be used from SetVirtualAddressMap() or during efi runtime calls
> >> + * (Note: This routine is heavily inspired from use_mm)
> >> + */
> >> +void efi_switch_mm(struct mm_struct *mm)
> >> +{
> >> +	struct task_struct *tsk = current;
> >> +
> >> +	task_lock(tsk);
> >> +	efi_scratch.prev_mm = tsk->active_mm;
> >> +	if (efi_scratch.prev_mm != mm) {
> >> +		mmgrab(mm);
> >> +		tsk->active_mm = mm;
> >> +	}
> >> +	switch_mm(efi_scratch.prev_mm, mm, NULL);
> >> +	task_unlock(tsk);
> >> +
> >> +	if (efi_scratch.prev_mm != mm)
> >> +		mmdrop(efi_scratch.prev_mm);
> >
> > I'm confused. You're mmdropping an mm that you are still keeping a
> > pointer to. This is also a bit confusing in the case where you do
> > efi_switch_mm(efi_scratch.prev_mm).
> >
> > This whole manipulation seems fairly dangerous to me for another
> > reason -- you're taking a user thread (I think) and swapping out its
> > mm to something that the user in question should *not* have access to.
> > What if a perf interrupt happens while you're in the alternate mm?
> > What if you segfault and dump core? Should we maybe just have a flag
> > that says "this cpu is using a funny mm", assert that the flag is
> > clear when scheduling, and teach perf, coredumps, etc not to touch
> > user memory when the flag is set?
>
> It appears we may have introduced this exact issue on arm64 and ARM by
> starting to run the UEFI runtime services with interrupts enabled.
> (perf does not use NMI on ARM, so the issue did not exist beforehand)
>
> Mark, Will, any thoughts?

Yup, I can cause perf to take samples from the EFI FW code, so that's
less than ideal.

The "funny mm" flag sounds like a good idea to me, though given recent
pain with sampling in the case of skid, I don't know exactly what we
should do if/when we take an overflow interrupt while in EFI.
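
For the sake of discussion, the flag idea could look roughly like the
userspace sketch below. This is not kernel code and not a proposed patch:
every name in it (cpu_in_funny_mm, funny_mm_enter(), funny_mm_exit(),
sched_assert_normal_mm(), sample_user_memory()) is hypothetical, a plain
array stands in for a real per-cpu variable, and it says nothing about
what to do with the sample on an overflow interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR_CPUS 4	/* hypothetical, fixed for this sketch */

/* Hypothetical per-CPU flag: true while that CPU runs on a "funny" mm. */
static bool cpu_in_funny_mm[NR_CPUS];

/* Would be called right after switching to the alternate mm. */
static void funny_mm_enter(int cpu)
{
	cpu_in_funny_mm[cpu] = true;
}

/* Would be called right before switching back to the previous mm. */
static void funny_mm_exit(int cpu)
{
	cpu_in_funny_mm[cpu] = false;
}

/* Scheduler-side check: the flag must be clear when we schedule. */
static void sched_assert_normal_mm(int cpu)
{
	assert(!cpu_in_funny_mm[cpu]);
}

/*
 * Stand-in for perf/coredump code that reads "user" memory.
 * Returns the number of bytes copied, or 0 if the access was refused
 * because the CPU is currently on a funny mm.
 */
static size_t sample_user_memory(int cpu, void *dst, const void *src,
				 size_t len)
{
	if (cpu_in_funny_mm[cpu])
		return 0;

	memcpy(dst, src, len);
	return len;
}
```

In this sketch a caller would bracket the firmware call with
funny_mm_enter(cpu) / funny_mm_exit(cpu), and anything that wants to
touch user memory from interrupt context would see a zero-length result
in between rather than reading through the wrong page tables.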

Mark.