Re: [PATCH v9 01/26] arm64: Fix HCR.TGE status for NMI contexts

From: Christoffer Dall
Date: Thu Jan 31 2019 - 03:19:05 EST


On Mon, Jan 28, 2019 at 03:42:42PM +0000, Julien Thierry wrote:
> Hi James,
>
> On 28/01/2019 11:48, James Morse wrote:
> > Hi Julien,
> >
> > On 21/01/2019 15:33, Julien Thierry wrote:
> >> When using VHE, the host needs to clear HCR_EL2.TGE bit in order
> >> to interract with guest TLBs, switching from EL2&0 translation regime
> >
> > (interact)
> >
> >
> >> to EL1&0.
> >>
> >> However, some non-maskable asynchronous event, such as SDEI, could happen
> >> while TGE is cleared. Because of this, address translation operations
> >> relying on the EL2&0 translation regime could fail (TLB invalidation,
> >> userspace access, ...).
> >>
> >> Fix this by properly setting HCR_EL2.TGE when entering NMI context and
> >> clearing it, if necessary, when returning to the interrupted context.
> >
> > Yes please. This would not have been fun to debug!
> >
> > Reviewed-by: James Morse <james.morse@xxxxxxx>
> >
> >
>
> Thanks.
>
> >
> > I was looking for why we need core code to do this, instead of updating the
> > arch's call sites. Your 'irqdesc: Add domain handlers for NMIs' patch (pointed
> > to from the cover letter) is the reason: core-code calls nmi_enter()/nmi_exit()
> > itself.
> >
>
> Yes, that's the main reason.
>
I wondered the same thing, but I don't understand the explanation :(

Why can't we do a local_daif_mask() around the (very small) calls that
clear TGE instead?
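
To make the suggestion concrete, a rough and untested sketch of that
alternative might look like the following. It is only an illustration under
stated assumptions: with_tge_clear(), probe_op() and the sim_* variables are
hypothetical names, and local_daif_save()/local_daif_restore() are stubbed
here (modelled on the arm64 helpers of the same name) so the snippet builds
as plain C outside the kernel. The save/restore pair is used rather than a
bare local_daif_mask() so the previous mask state can be put back. Whether
masking DAIF really keeps out every NMI-like event, SDEI in particular, is
exactly the open question.

```c
#include <stdint.h>

/* Stand-ins so the sketch builds outside the kernel. In-tree these would be
 * the real HCR_EL2 sysreg accessors and the arm64 DAIF helpers. */
#define HCR_TGE   (UINT64_C(1) << 27)   /* HCR_EL2.TGE */
#define DAIF_MASK 0x3c0UL               /* PSTATE.{D,A,I,F} all set */

static uint64_t sim_hcr_el2 = HCR_TGE;  /* hypothetical: models HCR_EL2 on a VHE host */
static unsigned long sim_daif;          /* hypothetical: models PSTATE.DAIF */

/* Stub: mask D, A, I and F, returning the previous mask state. */
static unsigned long local_daif_save(void)
{
	unsigned long flags = sim_daif;

	sim_daif = DAIF_MASK;
	return flags;
}

/* Stub: put back a mask state obtained from local_daif_save(). */
static void local_daif_restore(unsigned long flags)
{
	sim_daif = flags;
}

/* Hypothetical helper: run one guest TLB operation with TGE clear,
 * keeping D, A, I and F masked for the whole window. */
static void with_tge_clear(void (*tlb_op)(void *), void *arg)
{
	unsigned long flags = local_daif_save();

	sim_hcr_el2 &= ~HCR_TGE;        /* EL2&0 -> EL1&0; real code needs an isb() */
	tlb_op(arg);
	sim_hcr_el2 |= HCR_TGE;         /* back to EL2&0; real code needs an isb() */

	local_daif_restore(flags);
}

/* Hypothetical probe standing in for the TLB operation: it records what
 * the simulated state looks like inside the window. */
struct probe {
	int tge_clear;
	int daif_masked;
};

static void probe_op(void *arg)
{
	struct probe *p = arg;

	p->tge_clear = !(sim_hcr_el2 & HCR_TGE);
	p->daif_masked = (sim_daif == DAIF_MASK);
}
```

Calling with_tge_clear(probe_op, &p) shows the intended invariant: inside the
window TGE is clear and DAIF is fully masked, and both are back to their
previous values on return.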


Thanks,

Christoffer