Re: [PATCH net v2] ipmr: Fix access to mfc_cache_list without lock held

From: Paolo Abeni
Date: Wed Nov 20 2024 - 04:55:24 EST


On 11/15/24 17:55, Paolo Abeni wrote:
> On 11/15/24 17:07, Stefan Wiehler wrote:
>>> On Fri, 15 Nov 2024 01:16:27 -0800 Breno Leitao wrote:
>>>> This one seems to be discussed in the following thread already.
>>>>
>>>> https://lore.kernel.org/all/20241017174109.85717-1-stefan.wiehler@xxxxxxxxx/
>>>
>>> That's why it rang a bell.
>>> Stefan, are you planning to continue with the series?
>>
>> Yes, sorry for the delay, went on vacation and was busy with other tasks, but
>> next week I plan to continue (i.e. refactor using refcount_t).
>
> I forgot about that series and spent a little time investigating the
> scenario.
>
> I think we don't need a refcount: the tables are freed only at netns
> cleanup time, so the netns refcount is enough to guarantee that the
> tables are not deleted when escaping the RCU section.
>
> Some debug assertions could help clarify and document the scheme, and
> make it more robust to later changes.
>
> Side note, I think we need to drop the RCU lock moved by:
>
> https://lore.kernel.org/all/20241017174109.85717-2-stefan.wiehler@xxxxxxxxx/
>
> as the seqfile core can call blocking functions - alloc(GFP_KERNEL) -
> between ->start() and ->stop().
>
> The issue predates that patch, and even the patch that introduced the
> original RCU() - the old read_lock() already created an illegal
> atomic scope - but I think we should address it while touching this code.
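
To make the concern quoted above concrete, here is a minimal userspace
model of it. This is a sketch, not kernel code: every model_* name is
hypothetical, the "atomic depth" counter only stands in for an RCU
read-side critical section, and model_might_sleep() stands in for the
blocking call that the seqfile core can make between ->start() and
->stop(). The second variant reflects my reading of the proposal (take
the lock only around the lookup and rely on the netns reference
afterwards); it is an assumption, not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static int atomic_depth;    /* > 0 models "inside an RCU read-side section" */
static int illegal_sleeps;  /* blocking calls observed while atomic */

static void model_rcu_read_lock(void)   { atomic_depth++; }
static void model_rcu_read_unlock(void) { atomic_depth--; }

/* Stand-in for a blocking call such as an allocation with GFP_KERNEL. */
static void model_might_sleep(void)
{
	if (atomic_depth > 0)
		illegal_sleeps++;
}

struct model_seq_ops {
	void (*start)(void);
	void (*stop)(void);
};

/*
 * Models the seqfile core: it may block between ->start() and ->stop().
 * Returns how many blocking calls happened in atomic scope.
 */
static int model_seq_read(const struct model_seq_ops *ops)
{
	illegal_sleeps = 0;
	ops->start();
	model_might_sleep();
	ops->stop();
	return illegal_sleeps;
}

/* Variant A: lock held across the whole ->start()..->stop() window. */
static void locked_start(void) { model_rcu_read_lock(); }
static void locked_stop(void)  { model_rcu_read_unlock(); }

/* Variant B: lock taken only around the lookup, dropped before returning. */
static void scoped_start(void)
{
	model_rcu_read_lock();
	/* the table lookup would happen here */
	model_rcu_read_unlock();
}
static void scoped_stop(void) { }

static const struct model_seq_ops locked_ops = { locked_start, locked_stop };
static const struct model_seq_ops scoped_ops = { scoped_start, scoped_stop };
```

In this model, model_seq_read(&locked_ops) reports one blocking call in
atomic scope, while model_seq_read(&scoped_ops) reports none.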

@Stefan: are you OK with me going ahead with this work, or would you
prefer to finish it yourself?

Thanks,

Paolo