Re: [PATCH v3 7/8] arm64: exception: handle asynchronous SError interrupt
From: Xiongfeng Wang
Date: Tue Apr 18 2017 - 22:44:24 EST
Hi James,
Thanks for your reply.
On 2017/4/18 18:51, James Morse wrote:
> Hi Wang Xiongfeng,
>
> On 18/04/17 02:09, Xiongfeng Wang wrote:
>> I have some confusion about the RAS feature when VHE is enabled. Does the RAS spec support
>> the situation when VHE is enabled? When VHE is disabled, the hypervisor delegates the error
>> exception to EL1 by setting HCR_EL2.VSE to 1, and this will inject a virtual SEI into the OS.
>
> (The ARM-ARM also requires the HCR_EL2.AMO to be set so that physical SError
> Interrupts are taken to EL2, meaning EL1 can never receive a physical SError)
>
>
>> My understanding is that HCR_EL2.VSE is only used to inject a virtual SEI into EL1.
>
> ... mine too ...
>
>> But when VHE is enabled, the host OS will run at EL2. We can't inject a virtual SEI into
>> the host OS. I don't know if the RAS spec can handle this situation.
>
> The host expects to receive physical SError Interrupts. The ARM-ARM doesn't
> describe a way to inject these as they are generated by the CPU.
>
> Am I right in thinking you want this to use SError Interrupts as an APEI
> notification? (This isn't a CPU thing so the RAS spec doesn't cover this use)
Yes, using SEI as an APEI notification is one part of what I have in mind. The other use is ESB.
Section 6.5.3 of the RAS spec, 'Example software sequences: Variant: asynchronous External Abort with ESB',
describes the SEI recovery process when ESB is implemented.
In this situation, SEI is routed to EL3 (SCR_EL3.EA = 1). When an SEI occurs at EL0 and is not taken immediately,
and an ESB instruction is then executed at SVC entry, the SEI is taken to EL3. The ESB at SVC entry
prevents the error from propagating from user space to kernel space. The EL3 SEI handler collects
the errors, fills in the APEI table, and then jumps to the EL2 SEI handler. The EL2 SEI handler injects
a vSEI into EL1 by setting HCR_EL2.VSE = 1, so that an SEI is pending when control returns to the OS.
Then ESB is executed again, and hardware sets DISR_EL1.A (2.4.4 'ESB and virtual errors'), so that
the rest of the recovery process can run.
So we want to inject a vSEI into the OS when control returns from EL3/EL2, regardless of whether
it is the host OS or a guest OS. I don't know whether my understanding is right here.
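
To make the last two steps of that sequence concrete, here is a toy C model of
how I read them. It is only a sketch: struct vcpu_model and both functions are
hypothetical names for illustration, the fields stand in for register bits and
are not real accessors, and the behaviour encoded is my understanding of the
spec rather than something confirmed.

```c
#include <stdbool.h>

/* Toy state; each field stands in for one register bit (illustrative only). */
struct vcpu_model {
    bool hcr_vse;   /* HCR_EL2.VSE: a virtual SEI is pending for the OS */
    bool pstate_a;  /* PSTATE.A: SError is masked in the OS */
    bool disr_a;    /* DISR_EL1.A as seen by the OS: a deferred SEI is recorded */
    bool sei_taken; /* the OS took the vSEI as an exception instead */
};

/* EL2 SEI handler: mark a vSEI pending before returning to the OS. */
static void el2_inject_vsei(struct vcpu_model *v)
{
    v->hcr_vse = true;
}

/* The OS executes ESB, e.g. at SVC entry. */
static void os_esb(struct vcpu_model *v)
{
    if (!v->hcr_vse)
        return;                 /* nothing pending, ESB has nothing to do */

    if (v->pstate_a)
        v->disr_a = true;       /* masked: the vSEI is deferred into DISR_EL1.A */
    else
        v->sei_taken = true;    /* simplification: an unmasked vSEI would be
                                   taken as soon as it is pending, not only
                                   at the ESB */

    v->hcr_vse = false;         /* the pending vSEI has been consumed */
}
```

With PSTATE.A set, calling el2_inject_vsei() and then os_esb() leaves disr_a set
and hcr_vse clear, which is the state the rest of the recovery sequence relies on.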
>
> This is straightforward for the hypervisor to implement using Virtual SError.
> I don't think it's always feasible for the host, as Physical SError is routed
> to EL3 by SCR_EL3.EA, meaning there is no hardware generated SError that can
> reach EL2. Another APEI notification mechanism may be more appropriate.
>
> EL3 may be able to 'fake' an SError by returning into the appropriate EL2 vector
> if the exception came from EL{0,1}, or from EL2 and PSTATE.A is clear.
> If the SError came from EL2 and the ESR_EL3.IESB bit is set, we can write an
> appropriate ESR into DISR.
Yes, this can work. When VHE is enabled, we can set DISR.A in software, and 'fake'
an SError by returning into the EL2 SEI vector.
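
The cases you list could be sketched as a small decision function in EL3
firmware. This is only an illustration of the conditions in your paragraph:
the enum, the function name and its parameters are hypothetical and do not
come from any existing firmware interface.

```c
#include <stdbool.h>

/* How EL3 could hand a physical SError on to EL2 (hypothetical names). */
enum el3_delivery {
    DELIVER_VIA_EL2_VECTOR, /* 'fake' the SError by returning into the EL2 vector */
    DELIVER_VIA_DISR,       /* write an appropriate ESR into DISR for EL2 to read */
    DELIVER_IMPOSSIBLE      /* neither option applies */
};

/*
 * source_el: exception level the SError was taken from (0, 1 or 2)
 * pstate_a:  PSTATE.A of the interrupted context
 * esr_iesb:  ESR_EL3.IESB as reported for this SError
 */
static enum el3_delivery el3_pick_delivery(int source_el, bool pstate_a,
                                           bool esr_iesb)
{
    /* From EL0/EL1 the EL2 vector can always be entered. */
    if (source_el == 0 || source_el == 1)
        return DELIVER_VIA_EL2_VECTOR;

    /* From EL2, only if EL2 had SError unmasked. */
    if (source_el == 2 && !pstate_a)
        return DELIVER_VIA_EL2_VECTOR;

    /* From EL2 with SError masked: only the IESB case can use DISR. */
    if (source_el == 2 && esr_iesb)
        return DELIVER_VIA_DISR;

    return DELIVER_IMPOSSIBLE;
}
```

The last return is the case that remains open: an SError from EL2 with
PSTATE.A set and ESR_EL3.IESB clear.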
> You can't use SError to cover all the possible RAS exceptions. We already have
> this problem using SEI if PSTATE.A was set and the exception was an imprecise
> abort from EL2. We can't return to the interrupted context and we can't deliver
> an SError to EL2 either.
The SEI came from EL2 and PSTATE.A is set: is this the situation where VHE is enabled and the CPU is running
in kernel space? If the SEI occurs in kernel space, can we just panic or shut down?
>
> Setting SCR_EL3.EA allows firmware to handle these ugly corner cases. Notifying
> the OS is a separate problem where APEI's SEI may not always be the best choice.
>
>
> Thanks,
>
> James
>
> .
>
Thanks,
Wang Xiongfeng