Re: [BUG] x86/efi: MMRs no longer properly mapped after switch to isolated page table

From: Alex Thorlton
Date: Wed May 04 2016 - 12:33:00 EST

On Wed, May 04, 2016 at 12:36:36PM +0200, Borislav Petkov wrote:
> On Tue, May 03, 2016 at 01:47:51PM -0500, Alex Thorlton wrote:
> > I think this will work for us, for the most part. Only issue is that
> > the efi_call_virt macro is only accessible from inside
> > runtime-wrappers.c. If we could pull that macro (and whatever else it
> > needs) up to the header file, I think that might work for us. Not sure
> > if that's the appropriate solution, but it's a start.
> Should be doable. You could give it a try and see how ugly it can get.

I can do that. I don't think it should be too bad - I just wanted to
make sure that was an appropriate move before starting to work on it.

> > Yes, I do have CONFIG_EFI_PGT_DUMP=y. I don't *think* I see anything
> > strange in there, but I could be missing something. I will send you a
> > full dump of my log buffer with MLs et al. off of Cc.
> Sure.

I am sending this shortly. Yesterday evening got away from me :)

> > Take note that the Oops bits here indicate that it was a *write* from
> > kernel space that triggered this most recent Oops, whereas the ones we
> > were hitting before were all just missing pages in the mappings.
> >
> > This means my suggestion about the "if(efi_scratch..." bit was wrong.
> > This issue is still rolling around in my head. I'll address it below.
> One thing I don't see in your uv_call_virt() is you're not grabbing
> efi_runtime_lock like the rest of the EFI callers do. And there's
> __wake_up_common() somewhere there in the callstack, not on the current
> frame but there's also another uv_bios_call() in there and this all
> looks like some locking issue...
> So please convert it to the generic one first, do the calls as runtime
> services in drivers/firmware/efi/runtime-wrappers.c do and we can
> continue debugging.

Got it.

> > This is probably the answer for the future, when we can expect the
> > changes to these macros to be merged with the mainline kernel, but I don't
> > know exactly how long it will be before that happens.
> What's the hurry exactly here? You want stuff fixed in 4.6 when it
> releases in less than two weeks?

Well, in a perfect world, yes. I realize that might be a bit of a
stretch, but we'd *really* prefer to have 4.6 not be outright broken. I
think we might be able to get a few small fixes through to at
least get our machines booting. If worse comes to worst, we can get the
fixes into -tip and then wrap back around and try to fix up 4.6 in a
later stable kernel release. I guess the best we can do is try to work
quickly and see where things end up.

> Lemme try to understand the fallout range: that's only UV1 or UV3 too?
> Because the latest oops comes from UV3...
> If it is UV1 only, I'd say we don't care since you guys wanted to even
> kill that support :-)

Sorry, I may not have made this clear. Currently *all* UVs *except* for
UV1s are broken. All of the testing I've done since we started
discussing this issue has been done on a UV3000, but everything >= UV2
is currently broken.

> Btw, does "efi=old_memmap" fix things and could it be used as an interim
> workaround until we've fixed everything properly and stuff has trickled
> into -stable?

Unfortunately, without the call to map_low_mmrs(), even that doesn't
work. I think that's an easy fix that we might be able to get in for
4.6 though. It's literally a one-liner. I'm going to try to get that
out today, so at least our old workaround still works. I think it might
still have some trouble with modules doing EFI calls, but I'd be at
least halfway happy if the machine boots :)

- Alex