Re: [PATCH v4 2/2] ACPICA: Preserve memory opregion mappings
From: Al Stone
Date: Tue Jun 30 2020 - 15:57:14 EST
On 30 Jun 2020 17:52, Rafael J. Wysocki wrote:
> On Tue, Jun 30, 2020 at 5:31 PM Al Stone <ahs3@xxxxxxxxxx> wrote:
> >
> > On 30 Jun 2020 13:44, Rafael J. Wysocki wrote:
> > > On Mon, Jun 29, 2020 at 10:57 PM Al Stone <ahs3@xxxxxxxxxx> wrote:
> > > >
> > > > On 29 Jun 2020 18:33, Rafael J. Wysocki wrote:
> > > > > From: "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx>
> > > > >
> > > > > The ACPICA's strategy with respect to the handling of memory mappings
> > > > > associated with memory operation regions is to avoid mapping the
> > > > > entire region at once which may be problematic at least in principle
> > > > > (for example, it may lead to conflicts with overlapping mappings
> > > > > having different attributes created by drivers). It may also be
> > > > > wasteful, because memory opregions on some systems take up vast
> > > > > chunks of address space while the fields in those regions actually
> > > > > accessed by AML are sparsely distributed.
> > > > >
> > > > > For this reason, a one-page "window" is mapped for a given opregion
> > > > > on the first memory access through it and if that "window" does not
> > > > > cover an address range accessed through that opregion subsequently,
> > > > > it is unmapped and a new "window" is mapped to replace it. Next,
> > > > > if the new "window" is not sufficient to acess memory through the
> > > > > opregion in question in the future, it will be replaced with yet
> > > > > another "window" and so on. That may lead to a suboptimal sequence
> > > > > of memory mapping and unmapping operations, for example if two fields
> > > > > in one opregion separated from each other by a sufficiently wide
> > > > > chunk of unused address space are accessed in an alternating pattern.
> > > > >
> > > > > The situation may still be suboptimal if the deferred unmapping
> > > > > introduced previously is supported by the OS layer. For instance,
> > > > > the alternating memory access pattern mentioned above may produce
> > > > > a relatively long list of mappings to release with substantial
> > > > > duplication among the entries in it, which could be avoided if
> > > > > acpi_ex_system_memory_space_handler() did not release the mapping
> > > > > used by it previously as soon as the current access was not covered
> > > > > by it.
> > > > >
> > > > > In order to improve that, modify acpi_ex_system_memory_space_handler()
> > > > > to preserve all of the memory mappings created by it until the memory
> > > > > regions associated with them go away.
> > > > >
> > > > > Accordingly, update acpi_ev_system_memory_region_setup() to unmap all
> > > > > memory associated with memory opregions that go away.
> > > > >
> > > > > Reported-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > > > ---
> > > > > drivers/acpi/acpica/evrgnini.c | 14 ++++----
> > > > > drivers/acpi/acpica/exregion.c | 65 ++++++++++++++++++++++++----------
> > > > > include/acpi/actypes.h | 12 +++++--
> > > > > 3 files changed, 64 insertions(+), 27 deletions(-)
> > > > >
> > > > > diff --git a/drivers/acpi/acpica/evrgnini.c b/drivers/acpi/acpica/evrgnini.c
> > > > > index aefc0145e583..89be3ccdad53 100644
> > > > > --- a/drivers/acpi/acpica/evrgnini.c
> > > > > +++ b/drivers/acpi/acpica/evrgnini.c
> > > > > @@ -38,6 +38,7 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
> > > > > union acpi_operand_object *region_desc =
> > > > > (union acpi_operand_object *)handle;
> > > > > struct acpi_mem_space_context *local_region_context;
> > > > > + struct acpi_mem_mapping *mm;
> > > > >
> > > > > ACPI_FUNCTION_TRACE(ev_system_memory_region_setup);
> > > > >
> > > > > @@ -46,13 +47,14 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
> > > > > local_region_context =
> > > > > (struct acpi_mem_space_context *)*region_context;
> > > > >
> > > > > - /* Delete a cached mapping if present */
> > > > > + /* Delete memory mappings if present */
> > > > >
> > > > > - if (local_region_context->mapped_length) {
> > > > > - acpi_os_unmap_memory(local_region_context->
> > > > > - mapped_logical_address,
> > > > > - local_region_context->
> > > > > - mapped_length);
> > > > > + while (local_region_context->first_mm) {
> > > > > + mm = local_region_context->first_mm;
> > > > > + local_region_context->first_mm = mm->next_mm;
> > > > > + acpi_os_unmap_memory(mm->logical_address,
> > > > > + mm->length);
> > > > > + ACPI_FREE(mm);
> > > > > }
> > > > > ACPI_FREE(local_region_context);
> > > > > *region_context = NULL;
> > > > > diff --git a/drivers/acpi/acpica/exregion.c b/drivers/acpi/acpica/exregion.c
> > > > > index d15a66de26c0..fd68f2134804 100644
> > > > > --- a/drivers/acpi/acpica/exregion.c
> > > > > +++ b/drivers/acpi/acpica/exregion.c
> > > > > @@ -41,6 +41,7 @@ acpi_ex_system_memory_space_handler(u32 function,
> > > > > acpi_status status = AE_OK;
> > > > > void *logical_addr_ptr = NULL;
> > > > > struct acpi_mem_space_context *mem_info = region_context;
> > > > > + struct acpi_mem_mapping *mm = mem_info->cur_mm;
> > > > > u32 length;
> > > > > acpi_size map_length;
> > > >
> > > > I think this needs to be:
> > > >
> > > > acpi_size map_length = mem_info->length;
> > > >
> > > > since it now gets used in the ACPI_ERROR() call below.
> > >
> > > No, it's better to print the length value in the message.
> >
> > Yeah, that was the other option.
> >
> > > > I'm getting a "maybe used unitialized" error on compilation.
> > >
> > > Thanks for reporting!
> > >
> > > I've updated the commit in the acpica-osl branch with the fix.
> >
> > Thanks, Rafael.
> >
> > Do you have a generic way of testing this? I can see a way to do it
> > -- timing a call of a method in a dynamically loaded SSDT -- but if
> > you had a test case laying around, I could continue to be lazy :).
>
> I don't check the timing, but instrument the code to see if what
> happens is what is expected.
Ah, okay. Thanks.
> Now, the overhead reduction resulting from this change in Linux is
> quite straightforward: Every time the current mapping doesn't cover
> the request at hand, an unmap is carried out by the original code,
> which involves a linear search through acpi_ioremaps, and which
> generally is (at least a bit) more expensive than the linear search
> through the list of opregion-specific mappings introduced by the
> $subject patch, because quite likely the acpi_ioremaps list holds more
> items. And, of course, if the opregion in question holds many fields
> and they are not covered by one mapping, each of them needs to be
> mapped just once per the opregion life cycle.
Right. What I was debating as a generic test was something to try to
force an OpRegion through mapping and unmapping repeatedly with the
current code to determine a rough average elapsed time. Then, apply
the patch to see what the change does. Granted, a completely synthetic
scenario, and specifically designed to exaggerate the overhead, but
I'm just curious.
--
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Red Hat, Inc.
ahs3@xxxxxxxxxx
-----------------------------------