Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel

From: Paul E. McKenney
Date: Mon Jan 09 2017 - 19:21:39 EST


On Tue, Jan 10, 2017 at 12:42:47AM +0100, Rafael J. Wysocki wrote:
> On Tue, Jan 10, 2017 at 12:32 AM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, Jan 10, 2017 at 12:15:01AM +0100, Borislav Petkov wrote:
> >> On Mon, Jan 09, 2017 at 02:18:31PM -0800, Paul E. McKenney wrote:
> >> > @@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
> >> > {
> >> > struct rcu_state *rsp = rcu_state_p;
> >> >
> >> > + if (!rcu_scheduler_active)
> >> > + return;
> >> > _synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
> >> > }
> >> > EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
> >>
> >> That doesn't work and it is because of those damn what goes before what
> >> boot sequence issues :-\
> >>
> >> We have:
> >>
> >> rest_init()
> >> |-> rcu_scheduler_starting() ---> that sets rcu_scheduler_active = 1;
> >> |-> kernel_thread(kernel_init, NULL, CLONE_FS);
> >> |-> kernel_init()
> >> |-> kernel_init_freeable()
> >> |-> native_smp_prepare_cpus(setup_max_cpus)
> >> |-> default_setup_apic_routing
> >> |-> enable_IR_x2apic
> >> |-> irq_remapping_prepare
> >> |-> amd_iommu_prepare
> >> |-> iommu_go_to_state
> >> |-> acpi_put_table(ivrs_base);
> >> |-> acpi_tb_put_table(table_desc);
> >> |-> acpi_tb_invalidate_table(table_desc);
> >> |-> acpi_tb_release_table(...)
> >> |-> acpi_os_unmap_memory
> >> |-> acpi_os_unmap_iomem
> >> |-> acpi_os_map_cleanup
> >> |-> synchronize_rcu_expedited()
> >>
> >> Now here we have rcu_scheduler_active already set so the test doesn't
> >> hit and we hang.
> >>
> >> So we must do it differently.
> >
> > Yeah, there is a window just as the scheduler is starting where things don't
> > work.
> >
> > We could move rcu_scheduler_starting() later, as long as there
> > is no chance of preemption or context switch before it is invoked.
> > Would that help in this case, or are we already context switching before
> > acpi_os_map_cleanup() is invoked?
>
> In the particular AMD IOMMU case it doesn't look like we are, but we
> do in other cases.
>
> > (If we are already context switching,
> > short-circuiting synchronize_rcu_expedited() would be a bug.)
>
> It may be easier to make the caller avoid RCU synchronization
> altogether if that's not necessary and the caller should actually be
> able to figure out when that's the case.
>
> The patch from Lv at https://patchwork.kernel.org/patch/9504277/ goes
> in the right direction IMO, but I'm not yet convinced that this is the
> right one.

>From the RCU end, I could force expedited grace periods to translate to
normal grace periods during that window of time, and then make sure that
RCU's grace-period kthreads are spawned beforehand. Looking into this...

Thanx, Paul