Re: [PATCH v2] Revert "ACPI: Add memory semantics to acpi_os_map_memory()"

From: Rafael J. Wysocki
Date: Tue Sep 28 2021 - 13:27:17 EST

On 9/24/2021 11:04 AM, Lorenzo Pieralisi wrote:
On Thu, Sep 23, 2021 at 02:54:52PM +0200, Rafael J. Wysocki wrote:
On Thu, Sep 23, 2021 at 2:26 PM Mark Kettenis <mark.kettenis@xxxxxxxxx> wrote:
From: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
Date: Thu, 23 Sep 2021 13:05:05 +0200

On Thu, Sep 23, 2021 at 11:40 AM Lorenzo Pieralisi
<lorenzo.pieralisi@xxxxxxx> wrote:
On Thu, Sep 23, 2021 at 01:09:58AM +0200, Mark Kettenis wrote:
Date: Wed, 22 Sep 2021 17:33:36 +0100
From: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>

On Fri, Sep 10, 2021 at 10:32:23PM +0800, Jia He wrote:
This reverts commit 437b38c51162f8b87beb28a833c4d5dc85fa864e.

After this commit, a boot panic is alway hit on an Ampere EMAG server
with call trace as follows:
Internal error: synchronous external abort: 96000410 [#1] SMP
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0+ #462
Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 0.14 02/22/2019
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Call trace:

As mentioned by Lorenzo:
"We are forcing memory semantics mappings to PROT_NORMAL_NC, which
eMAG does not like at all and I'd need to understand why. It looks
like the issue happen in SystemMemory Opregion handler."

Hence just revert it before everything is clear.

Fixes: 437b38c51162 ("ACPI: Add memory semantics to acpi_os_map_memory()")
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
Cc: Hanjun Guo <guohanjun@xxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
Cc: Harb Abdulhamid <harb@xxxxxxxxxxxxxxxxxxx>

Signed-off-by: Jia He <justin.he@xxxxxxx>
Rewrote the commit log, please take the patch below and repost
it as a v3.

It would still be great if Ampere can help us understand why
the NormalNC attributes trigger a sync abort on the opregion
before merging it.
To be honest, I don't think you really need an explanation from Ampere
here. Mapping a part of the address space that doesn't provide memory
semantics with NormalNC attributes is wrong and triggering a sync
abort in that case is way better than silently ignoring the access.
That's understood and that's what I explained in the revert commit
log, no question about it.

I was just asking to confirm if that's what's actually happening.

Putting my OpenBSD hat on (where we have our own ACPI OSPM
implementation) I must say that we always interpreted SystemMemory as
memory mapped IO and I think that is a logical choice as SystemIO is
used for (non-memory mapped) IO. And I'd say that the ACPI OSPM code
should make sure that it uses properly aligned access to any Field
object that doesn't use AnyAcc as its access type. Even on x86! And
I'd say that AML that uses AnyAcc fields for SystemMemory OpRegions on
arm64 is buggy.

But maybe relaxing this when the EFI memory map indicates that the
address space in question does provide memory semantics does make
sense. That should defenitely be documented in the ACPI standard
Mapping SystemMemory Opregions as "memory" does not make sense
at all to me. Still, that's what Linux ACPICA code does (*if*
that's what acpi_os_map_memory() is supposed to mean).
It doesn't need to do that, though, if there are good enough arguments
to change the current behavior (and the argument here is that it may
be an MMIO region, so mapping it as memory doesn't really work, but it
also may be a region in memory - there is no rule in the spec by which
SystemMemory Opregions cannot be "memory" AFAICS) and if that change
doesn't introduce regressions in the installed base.

Where do we go from here, to be defined, we still have a bug
to fix after the revert is applied.


maps BERT error regions with acpi_os_map_memory().
That mechanism is basically used for exporting ACPI tables to user
space and they are known to reside in memory. Whether or not BERT
regions should be mapped in the same way is a good question.
It is not inconceivable that BERT regions actually live in memory of
the BMC that is exposed over a bus that doesn't implement memory
semantics is it?
No, it isn't, which is why I think that mapping them as RAM may not be
a good idea in general.
Should I patch acpi_data_show() to map BERT error regions (well, that's
what acpi_data_show() is used on at the moment) as MMIO and use the
related memcpy routine to read them then :) ?

It actually would be good to clean it up so it is clear that this is only used for BERT.

And then there is this question: if this is not RAM (so effectively it is device memory), should it be exposed directly to user space?