Re: [PATCH kernel] srcu: Fix static initialization
From: Alexey Kardashevskiy
Date: Mon Sep 21 2020 - 20:41:46 EST
On 17/09/2020 02:12, Paul E. McKenney wrote:
> On Fri, Sep 11, 2020 at 06:52:08AM -0700, Paul E. McKenney wrote:
>> On Fri, Sep 11, 2020 at 03:09:41PM +1000, Alexey Kardashevskiy wrote:
>>> On 11/09/2020 04:53, Paul E. McKenney wrote:
>>>> On Wed, Sep 09, 2020 at 10:31:03PM +1000, Alexey Kardashevskiy wrote:
>
> [ . . . ]
>
>>>>> init_srcu_struct_nodes() assumes ssp->sda!=NULL but alloc_percpu() fails
>>>>> here:
>>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/percpu.c#n1734
>>>>> ===
>>>>> } else if (mutex_lock_killable(&pcpu_alloc_mutex)) {
>>>>> pcpu_memcg_post_alloc_hook(objcg, NULL, 0, size);
>>>>> return NULL;
>>>>> ===
>>>>>
>>>>> I am still up to reading that osr-rcuusage.pdf to provide better
>>>>> analysis :) Thanks,
>>>>
>>>> Ah, got it! Does the following patch help?
>>>>
>>>> There will likely also need to be changes to cleanup_srcu_struct(),
>>>> but first let's see if I understand the problem. ;-)
>>>>
>>>> Thanx, Paul
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
>>>> index c13348e..6f7880a 100644
>>>> --- a/kernel/rcu/srcutree.c
>>>> +++ b/kernel/rcu/srcutree.c
>>>> @@ -177,11 +177,13 @@ static int init_srcu_struct_fields(struct srcu_struct *ssp, bool is_static)
>>>> INIT_DELAYED_WORK(&ssp->work, process_srcu);
>>>> if (!is_static)
>>>> ssp->sda = alloc_percpu(struct srcu_data);
>>>> + if (!ssp->sda)
>>>> + return -ENOMEM;
>>>> init_srcu_struct_nodes(ssp, is_static);
>>>> ssp->srcu_gp_seq_needed_exp = 0;
>>>> ssp->srcu_last_gp_end = ktime_get_mono_fast_ns();
>>>> smp_store_release(&ssp->srcu_gp_seq_needed, 0); /* Init done. */
>>>
>>> The line above confuses me a bit. What you propose returns without
>>> smp_store_release() called which should not matter I suppose.
>>
>> The idea is that if init_srcu_struct() returns -ENOMEM, the structure
>> has not been initialized and had better not be used. If the calling code
>> cannot handle that outcome, then the calling code needs to do something
>> to insulate init_srcu_struct() from signals. One thing that it could
>> do would be to invoke init_srcu_struct() from a workqueue handler and
>> wait for this handler to complete.
>>
>> Please keep in mind that there is nothing init_srcu_struct() can do
>> about this: The srcu_struct is useless unless alloc_percpu() succeeds.
>>
>> And yes, I do need to update the header comments to make this clear.
>>
>>> Otherwise it should work, although I cannot verify right now as my box
>>> went down and since it is across Pacific - it may take time to power
>>> cycle it :) Thanks,
>>
>> I know that feeling! And here is hoping that the box is out of reach
>> of the local hot spots. ;-)
>
> Just following up... Did that patch help?
Yes it did.
Tested-by: Alexey Kardashevskiy <aik@xxxxxxxxx>
>
> Thanx, Paul
>
--
Alexey