Re: [PATCH 2/2] acpi, apei: use appropriate pgprot_t to map GHES memory

From: Zhang, Jonathan Zhixiong
Date: Mon Aug 24 2015 - 14:22:58 EST

On 8/22/2015 2:24 AM, Ingo Molnar wrote:

* Jonathan (Zhixiong) Zhang <zjzhang@xxxxxxxxxxxxxx> wrote:

From: "Jonathan (Zhixiong) Zhang" <zjzhang@xxxxxxxxxxxxxx>

With ACPI APEI firmware first handling, generic hardware error
record is updated by firmware in GHES memory region. On an arm64
platform, firmware updates GHES memory region with uncached
access attribute, and then Linux reads stale data from cache.

This paragraph *still* doesn't parse for me. It's not any English
I can recognize: what is a 'With ACPI APEI firmware first handling'?

APEI is the ACPI Platform Error Interface. It is the part of the ACPI
specification that defines how hardware errors are handled. "Firmware
first handling" is a term used in APEI. It describes a mechanism in
which, when a hardware error occurs, firmware intercepts and handles
the error, formulates a hardware error record and writes it to the
GHES memory region, and then notifies the kernel through an NMI or
interrupt. The kernel GHES driver then reads the error record from
the GHES memory region.
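
As a rough illustration of that hand-off, here is a minimal userspace
sketch in C. It is not the kernel's GHES driver: the struct and the
function names (model_estatus, model_firmware_report,
model_kernel_read) are hypothetical, the record is reduced to a status
word plus a severity field, and the notification step is left out. It
assumes that a zero block status means "no new error".

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for a generic error status block. */
struct model_estatus {
	uint32_t block_status;	/* non-zero when a record is present */
	uint32_t severity;	/* illustrative payload only */
};

/* Firmware side: write the record into the shared GHES region. */
static void model_firmware_report(struct model_estatus *ghes_region,
				  uint32_t severity)
{
	ghes_region->severity = severity;
	ghes_region->block_status = 1;
}

/*
 * Kernel side: copy the record out if one is present and clear the
 * status word so the region can be reused. Returns 1 when a record
 * was read and 0 when the region holds no new error.
 */
static int model_kernel_read(struct model_estatus *ghes_region,
			     struct model_estatus *out)
{
	if (ghes_region->block_status == 0)
		return 0;

	*out = *ghes_region;
	ghes_region->block_status = 0;
	return 1;
}
```

In this model a read before model_firmware_report() returns 0, a read
after it returns 1 with the severity filled in, and a second read
returns 0 again because the status word was cleared.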


With current code, GHES memory region is mapped with PAGE_KERNEL
based on the assumption that cache coherency of GHES memory region
is maintained by firmware on all platforms. This assumption is
not true for above mentioned arm64 platform.

Instead GHES memory region should be mapped with page protection type
according to what is returned from arch_apei_get_mem_attribute().

... plus what this changelog still doesn't mention is the most important part of
any bug fix description: how does the user notice this in practice and why does he
care?

The changelog mentions that Linux would read stale data from the cache.
When stale data is read, the kernel reports that there is no new
hardware error when there actually is one. This may lead to further
damage in various scenarios, such as data corruption caused by error
propagation.
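
To make the failure mode concrete, here is a small userspace model in
C. It is only a sketch under stated assumptions, not the patch itself:
model_region, model_prot, model_get_mem_attribute() and
model_read_block_status() are hypothetical names, and the "cache" is
just a second field holding an old value. The real
arch_apei_get_mem_attribute() works on a physical address and returns
a pgprot_t; the model replaces that with a simple flag.

```c
#include <stdint.h>

/* Hypothetical mapping types standing in for pgprot_t values. */
enum model_prot { MODEL_PROT_CACHED, MODEL_PROT_UNCACHED };

/* One status word as seen in memory and as seen through a stale cache. */
struct model_region {
	uint32_t dram_block_status;	/* what firmware wrote, uncached */
	uint32_t cached_block_status;	/* old copy still held in the cache */
};

/*
 * Hypothetical stand-in for an arch hook that picks the mapping type.
 * firmware_uncached models a platform whose firmware updates the GHES
 * region with uncached accesses.
 */
static enum model_prot model_get_mem_attribute(int firmware_uncached)
{
	return firmware_uncached ? MODEL_PROT_UNCACHED : MODEL_PROT_CACHED;
}

/* A cached mapping sees the stale copy; an uncached one sees memory. */
static uint32_t model_read_block_status(const struct model_region *r,
					enum model_prot prot)
{
	if (prot == MODEL_PROT_UNCACHED)
		return r->dram_block_status;
	return r->cached_block_status;
}
```

With dram_block_status set to 1 and cached_block_status still 0, a read
through MODEL_PROT_CACHED yields 0 (the error is missed), while a read
through the attribute chosen by model_get_mem_attribute(1) yields 1.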

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html


--
Jonathan (Zhixiong) Zhang
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project