Re: 174cc7187e6f ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel

From: Rafael J. Wysocki
Date: Mon Jan 09 2017 - 18:42:54 EST


On Tue, Jan 10, 2017 at 12:32 AM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Tue, Jan 10, 2017 at 12:15:01AM +0100, Borislav Petkov wrote:
>> On Mon, Jan 09, 2017 at 02:18:31PM -0800, Paul E. McKenney wrote:
>> > @@ -690,6 +690,8 @@ void synchronize_rcu_expedited(void)
>> > {
>> > struct rcu_state *rsp = rcu_state_p;
>> >
>> > + if (!rcu_scheduler_active)
>> > + return;
>> > _synchronize_rcu_expedited(rsp, sync_rcu_exp_handler);
>> > }
>> > EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
>>
>> That doesn't work and it is because of those damn what goes before what
>> boot sequence issues :-\
>>
>> We have:
>>
>> rest_init()
>> |-> rcu_scheduler_starting() ---> that sets rcu_scheduler_active = 1;
>> |-> kernel_thread(kernel_init, NULL, CLONE_FS);
>> |-> kernel_init()
>> |-> kernel_init_freeable()
>> |-> native_smp_prepare_cpus(setup_max_cpus)
>> |-> default_setup_apic_routing
>> |-> enable_IR_x2apic
>> |-> irq_remapping_prepare
>> |-> amd_iommu_prepare
>> |-> iommu_go_to_state
>> |-> acpi_put_table(ivrs_base);
>> |-> acpi_tb_put_table(table_desc);
>> |-> acpi_tb_invalidate_table(table_desc);
>> |-> acpi_tb_release_table(...)
>> |-> acpi_os_unmap_memory
>> |-> acpi_os_unmap_iomem
>> |-> acpi_os_map_cleanup
>> |-> synchronize_rcu_expedited()
>>
>> Now here we have rcu_scheduler_active already set so the test doesn't
>> hit and we hang.
>>
>> So we must do it differently.
>
> Yeah, there is a window just as the scheduler is starting where things don't
> work.
>
> We could move rcu_scheduler_starting() later, as long as there
> is no chance of preemption or context switch before it is invoked.
> Would that help in this case, or are we already context switching before
> acpi_os_map_cleanup() is invoked?

In the particular AMD IOMMU case it doesn't look like we are, but we
do in other cases.

> (If we are already context switching,
> short-circuiting synchronize_rcu_expedited() would be a bug.)

It may be easier to make the caller avoid RCU synchronization
altogether if that's not necessary and the caller should actually be
able to figure out when that's the case.

The patch from Lv at https://patchwork.kernel.org/patch/9504277/ goes
in the right direction IMO, but I'm not yet convinced that this is the
right one.

Thanks,
Rafael