Re: [PATCH] rcu: Narrow early boot window of illegal synchronous grace periods

From: Rafael J. Wysocki
Date: Sat Jan 14 2017 - 07:28:49 EST


On Sat, Jan 14, 2017 at 11:35 AM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Sat, Jan 14, 2017 at 12:00:22AM -0800, Paul E. McKenney wrote:
>> It now looks like this:
>>
>> ------------------------------------------------------------------------
>>
>> Note that the code was buggy even before this commit, as it was subject
>> to failure on real-time systems that forced all expedited grace periods
>> to run as normal grace periods (for example, using the rcu_normal ksysfs
>> parameter). The callchain from the failure case is as follows:
>>
>> early_amd_iommu_init()
>> |-> acpi_put_table(ivrs_base);
>> |-> acpi_tb_put_table(table_desc);
>> |-> acpi_tb_invalidate_table(table_desc);
>> |-> acpi_tb_release_table(...)
>> |-> acpi_os_unmap_memory
>> |-> acpi_os_unmap_iomem
>> |-> acpi_os_map_cleanup
>> |-> synchronize_rcu_expedited
>>
>> The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
>> which caused the code to try using workqueues before they were
>> initialized, which did not go well.
>>
>> ------------------------------------------------------------------------
>>
>> Does that work?
>
> Yap, thanks.
>
>> Fair point, but this wording appears in almost all of my patches. ;-)
>
> :-)
>
>> My rationale is that it provides a clear transition from describing the
>> problem to introducing the solution.
>
> Fair enough.
>
>> Exactly, but yes, worth a comment.
>>
>> The header comment for rcu_scheduler_starting() is now as follows:
>>
>> /*
>>  * During boot, we forgive RCU lockdep issues.  After this function is
>>  * invoked, we start taking RCU lockdep issues seriously.  Note that unlike
>>  * Tree RCU, Tiny RCU transitions directly from RCU_SCHEDULER_INACTIVE
>>  * to RCU_SCHEDULER_RUNNING, skipping the RCU_SCHEDULER_INIT stage.
>>  * The reason for this is that Tiny RCU does not need kthreads, so does
>>  * not have to care about the fact that the scheduler is half-initialized
>>  * at a certain phase of the boot process.
>>  */
>
> Good.
>
>> I believe that this would not buy very much, but if this variable starts
>> showing up on profiles, then perhaps a jump label would be appropriate.
>> As a separate patch, though!
>
> Yeah, let's keep that opportunity in the bag, just in case.
>
>> Thank you for your review and comments!
>
> Thanks for the fix.
>
> Btw, I'll build one more test kernel for people with your final version here:
>
> https://lkml.kernel.org/r/1484383554-18095-2-git-send-email-paulmck@xxxxxxxxxxxxxxxxxx
>
> backported to 4.9.
>
> I say 4.9 because the reports started then, probably because of
>
> 8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")
>
> Which means, you probably should tag your fix CC:stable and add
>
> Fixes: 8b355e3bc140 ("rcu: Drive expedited grace periods from workqueue")
>
> to it too.

OK, so this fixes the problem with synchronize_rcu_expedited() in
acpi_os_map_cleanup(), right?

I wonder if the ACPI-specific fix is still needed, then?

Thanks,
Rafael
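
[Illustrative note, not part of the patch under discussion]

The three scheduler states named in the quoted header comment can be
modelled in user space as a small state machine. This is only a sketch:
the state names come from the quoted comment, but struct rcu_model,
model_advance(), model_expedited_path() and the gp_path values are
hypothetical names made up for this illustration. The mapping from
state to grace-period handling (no-op in early boot, driven directly
in mid-boot, workqueue afterwards) is an assumption about the intent
of the fix, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* State names as quoted above; the rest is an illustrative model only. */
enum rcu_sched_state {
	RCU_SCHEDULER_INACTIVE,	/* early boot */
	RCU_SCHEDULER_INIT,	/* mid-boot: scheduler only half-initialized */
	RCU_SCHEDULER_RUNNING,	/* fully booted */
};

/* Hypothetical: ways a synchronous expedited grace period might be handled. */
enum gp_path {
	GP_NOOP,	/* assumed: nothing to wait for in early boot */
	GP_DIRECT,	/* assumed: driven by the requesting task itself */
	GP_WORKQUEUE,	/* handed off to a workqueue */
};

struct rcu_model {
	enum rcu_sched_state state;
	bool needs_kthreads;	/* true models Tree RCU, false models Tiny RCU */
};

/*
 * Advance one boot stage.  A flavor that needs kthreads passes through
 * RCU_SCHEDULER_INIT; one that does not goes straight to RUNNING.
 */
static void model_advance(struct rcu_model *m)
{
	assert(m->state != RCU_SCHEDULER_RUNNING);	/* nothing after RUNNING */
	if (m->state == RCU_SCHEDULER_INACTIVE && m->needs_kthreads)
		m->state = RCU_SCHEDULER_INIT;
	else
		m->state = RCU_SCHEDULER_RUNNING;
}

/* Pick the (assumed) grace-period path for the current boot stage. */
static enum gp_path model_expedited_path(const struct rcu_model *m)
{
	switch (m->state) {
	case RCU_SCHEDULER_INACTIVE:
		return GP_NOOP;
	case RCU_SCHEDULER_INIT:
		return GP_DIRECT;	/* workqueues must not be used yet */
	default:
		return GP_WORKQUEUE;
	}
}
```

In this model, a caller such as the acpi_os_map_cleanup() path in the
callchain above would never reach GP_WORKQUEUE before the state is
RCU_SCHEDULER_RUNNING, which is the failure the quoted commit log
describes.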