RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel
From: Zheng, Lv
Date: Mon Jan 09 2017 - 00:21:43 EST
Hi, Borislav
> From: Zheng, Lv
> Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> early_acpi_os_unmap_memory() from Linux kernel
>
> Hi,
>
> > From: linux-acpi-owner@xxxxxxxxxxxxxxx [mailto:linux-acpi-owner@xxxxxxxxxxxxxxx] On Behalf Of Zheng,
> > Lv
> > Subject: RE: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > early_acpi_os_unmap_memory() from Linux kernel
> >
> > Hi,
> >
> > > From: linux-acpi-owner@xxxxxxxxxxxxxxx [mailto:linux-acpi-owner@xxxxxxxxxxxxxxx] On Behalf Of
> > Borislav
> > > Petkov
> > > Subject: Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and
> > > early_acpi_os_unmap_memory() from Linux kernel
> > >
> > > On Sun, Jan 08, 2017 at 03:20:20AM +0100, Rafael J. Wysocki wrote:
> > > > drivers/iommu/amd_iommu_init.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux-pm/drivers/iommu/amd_iommu_init.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/iommu/amd_iommu_init.c
> > > > +++ linux-pm/drivers/iommu/amd_iommu_init.c
> > > > @@ -2230,7 +2230,7 @@ static int __init early_amd_iommu_init(v
> > > > */
> > > > ret = check_ivrs_checksum(ivrs_base);
> > > > if (ret)
> > > > - return ret;
> > > > + goto out;
> > > >
> > > > amd_iommu_target_ivhd_type = get_highest_supported_ivhd_type(ivrs_base);
> > > > DUMP_printk("Using IVHD type %#x\n", amd_iommu_target_ivhd_type);
> > >
> > > Good catch, this one needs to be applied regardless.
> > >
> > > However, it doesn't fix my issue though.
> > >
> > > But I think I have it - I went and applied the well-proven debugging
> > > technique of sprinkling printks around. Here's what I'm seeing:
> > >
> > > early_amd_iommu_init()
> > > |-> acpi_put_table(ivrs_base);
> > > |-> acpi_tb_put_table(table_desc);
> > > |-> acpi_tb_invalidate_table(table_desc);
> > > |-> acpi_tb_release_table(...)
> > > |-> acpi_os_unmap_memory
> > > |-> acpi_os_unmap_iomem
> > > |-> acpi_os_map_cleanup
> > > |-> synchronize_rcu_expedited <-- the kernel/rcu/tree_exp.h version with CONFIG_PREEMPT_RCU=y
> > >
> > > Now that function goes and sends IPIs, i.e., schedule_work()
> > > but this is too early - we haven't even done workqueue_init().
> > > Actually, from looking at the callstack, we do
> > > kernel_init_freeable->native_smp_prepare_cpus() and workqueue_init()
> > > comes next.
> > >
> > > And this makes sense because the splat rIP points to __queue_work() but
> > > we haven't done that yet.
> > >
> > > So that acpi_put_table() is happening too early. Looks like AMD IOMMU
> > > should not put the table but WTH do I know?!
> > >
> > > In any case, commenting out:
> > >
> > > acpi_put_table(ivrs_base);
> > > ivrs_base = NULL;
> > >
> > > and the end of early_amd_iommu_init() makes the box boot again.
> >
> > So please help to comment out these 2 lines (with descriptions and do not delete them).
> > Until acpi_os_unmap_memory() is able to handle such an early case.
>
> IMO, synchronize_rcu_expedited() should be improved:
> If rcu_init() isn't called or there is nothing to synchronize, schedule_work() shouldn't be invoked.
Hmm, I think this problem has its history.
In APEI code, NMI handlers cannot utilize spinlock.
So the developers there used RCU to synchronize NMI handlers before the register region is unmapped.
At that time, there might not be pre-map/post-unmap code prepared in the APEI drivers.
So the APEI developers relied on map/unmap logics implemented in the ACPICA register read/write APIs where the OSL map/unmap is invoked.
That's why the RCU code is in acpi_os_xxx().
If:
1. there is map/unmap/read/write operations available for APEI developers to invoke;
2. RCU synchronization is invoked before invoking the last unmap operation;
3. map/unmap/read/write order is strictly ensured inside of the APEI drivers.
Then we can remove RCU stuffs from acpi_os_xxx.
And the root cause of this issue can be fixed.
I'm not sure if this approach is possible, but can give it a try.
Before that I think all such acpi_put_table() should just be commented out.
Thanks and best regards
Lv
>
> Thanks and best regards
> Lv
>
> >
> > Thanks and best regards
> > Lv
> >
> > >
> > > --
> > > Regards/Gruss,
> > > Boris.
> > >
> > > Good mailing practices for 400: avoid top-posting and trim the reply.