Re: [PATCH V7 04/10] arm64: exception: handle Synchronous External Abort

From: Baicar, Tyler
Date: Wed Jan 18 2017 - 18:26:33 EST


Hello James,


On 1/17/2017 3:31 AM, James Morse wrote:
Hi Tyler,

On 12/01/17 18:15, Tyler Baicar wrote:
SEA exceptions are often caused by an uncorrected hardware
error, and are handled when data abort and instruction abort
exception classes have specific values for their Fault Status
Code.
When SEA occurs, before killing the process, go through
the handlers registered in the notification list.
Update fault_info[] with specific SEA faults so that the
new SEA handler is used.
@@ -480,6 +496,28 @@ static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs)
return 1;
}
+/*
+ * This abort handler deals with Synchronous External Abort.
+ * It calls notifiers, and then returns "fault".
+ */
+static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
+{
+	struct siginfo info;
+
+	atomic_notifier_call_chain(&sea_handler_chain, 0, NULL);
+
+	pr_err("Synchronous External Abort: %s (0x%08x) at 0x%016lx\n",
+		 fault_name(esr), esr, addr);
+
+	info.si_signo = SIGBUS;
+	info.si_errno = 0;
+	info.si_code = 0;
Half of the other do_*() functions in this file read the signo and code from the
fault_info table.
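
As a rough illustration of that table-driven pattern, here is a minimal
user-space sketch. The demo_* names and the cut-down table are hypothetical
stand-ins, not the kernel's definitions; the two entries assume the fault
status codes 0x10 and 0x18 in the low six bits of the ESR.

```c
#include <signal.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's fault_info[] table. */
struct demo_fault_info {
	int sig;          /* signal to deliver */
	int code;         /* si_code to report */
	const char *name; /* human-readable fault name */
};

/* Assumption: the fault status code is the low six bits of the ESR. */
#define DEMO_FSC_MASK 0x3fU

/* Entries not listed here are zero-initialised (sig 0, name NULL). */
static const struct demo_fault_info demo_fault_info[64] = {
	[0x10] = { SIGBUS, 0, "synchronous external abort" },
	[0x18] = { SIGBUS, 0, "synchronous parity error" },
};

/* Look up the table entry for a given ESR value. */
static const struct demo_fault_info *demo_esr_to_fault_info(unsigned int esr)
{
	return &demo_fault_info[esr & DEMO_FSC_MASK];
}
```

A handler written this way would fill info.si_signo and info.si_code from
the looked-up entry instead of hard-coding them.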


+	info.si_addr = (void __user *)addr;
addr here was read from FAR_EL1, but for some of the classes of exception you
have listed below this register isn't updated with the faulting address.

The ARM-ARM version 'k' in D1.10.5 "Summary of registers on faults taken to an
Exception level that is using AArch64" has:
The architecture permits that the FAR_ELx is UNKNOWN for Synchronous External
Aborts other than Synchronous External Aborts on Translation Table Walks. In
this case, the ISS.FnV bit returned in ESR_ELx indicates whether FAR_ELx is
valid.
This is a problem if we get 'synchronous external abort' or 'synchronous parity
error' while a user space process was running.
It looks like this would only cause an incorrect address to be printed by the
pr_err() above. Unless I'm missing something, I don't see arm64_notify_die(),
or anything called from it, using info.si_addr.

What do you suggest I do here? The firmware should be reporting the physical
and virtual address information, if it is available, in the HEST entry that
the kernel will parse. So should I just remove the use of the addr parameter
in do_sea()?
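
For reference, one alternative to dropping addr entirely would be to gate
it on the FnV bit described in the ARM-ARM text quoted above. The following
is only a user-space sketch under stated assumptions: the demo_* names are
hypothetical, and the bit position (bit 10 of the ISS) is my reading of the
ESR_ELx encoding for data and instruction aborts, which should be checked
against the ARM-ARM before relying on it.

```c
#include <stdint.h>

/* Assumption: ISS.FnV ("FAR not Valid") is bit 10 of ESR_ELx. */
#define DEMO_ESR_FNV (1U << 10)

/*
 * Return the address to report for a synchronous external abort:
 * the FAR value when the ESR says it is valid, 0 otherwise.
 */
static uint64_t demo_sea_report_addr(uint32_t esr, uint64_t far)
{
	if (esr & DEMO_ESR_FNV)
		return 0; /* FAR is UNKNOWN, do not report it */

	return far;
}
```

With something like this, both the pr_err() and info.si_addr would only see
the FAR value when the hardware marks it as valid.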

Thanks,
Tyler
+	arm64_notify_die("", regs, &info, esr);
+
+	return 0;
+}
+
static const struct fault_info {
int (*fn)(unsigned long addr, unsigned int esr, struct pt_regs *regs);
int sig;

Thanks,

James



--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.