Re: [PATCH net-next v2] ipv6: fix locking issues with loops over idev->addr_list

From: Niels Dossche
Date: Mon Apr 04 2022 - 09:57:25 EST


On 04/04/2022 14:47, Andrew Lunn wrote:
> On Mon, Apr 04, 2022 at 01:15:24AM +0200, Niels Dossche wrote:
>> idev->addr_list needs to be protected by idev->lock. However, it is not
>> always possible to do so while iterating and performing actions on
>> inet6_ifaddr instances. For example, multiple functions (like
>> addrconf_{join,leave}_anycast) eventually call down to other functions
>> that acquire the idev->lock. The current code temporarily unlocked the
>> idev->lock during the loops, which can cause race conditions. Moving the
>> locks up is also not an appropriate solution as the ordering of lock
>> acquisition will be inconsistent with for example mc_lock.
>
> Hi Niels
>
> What sort of issues could the race result in?

Hi Andrew

The issue is that idev->lock is dropped for a brief moment inside the loop, which leaves the address list unprotected during that window.
The iteration over the list is therefore no longer atomic.
I believe the list entries might become corrupted if a race occurs in that window.
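
To illustrate what keeping the walk atomic can look like, here is a minimal userspace sketch of one general pattern: snapshot the entries under a single lock hold, then call the functions that take the lock themselves. This is not the kernel code and not necessarily what the patch does. All names (fake_lock, fake_idev, fake_ifaddr, fake_join, fake_join_all, MAX_ADDRS) are hypothetical, and the "lock" is only a flag that asserts the locking discipline.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ADDRS 8 /* hypothetical bound for this sketch only */

/* Hypothetical non-recursive lock: the flag just checks the discipline. */
struct fake_lock { int held; };

static void fake_lock_acquire(struct fake_lock *l) { assert(!l->held); l->held = 1; }
static void fake_lock_release(struct fake_lock *l) { assert(l->held); l->held = 0; }

/* Hypothetical stand-ins for inet6_ifaddr and inet6_dev. */
struct fake_ifaddr {
	int visited;
	struct fake_ifaddr *next;
};

struct fake_idev {
	struct fake_lock lock;
	struct fake_ifaddr *addr_list;
};

/* Stand-in for a callee that acquires the lock itself. */
static void fake_join(struct fake_idev *idev, struct fake_ifaddr *ifa)
{
	fake_lock_acquire(&idev->lock);
	ifa->visited = 1;
	fake_lock_release(&idev->lock);
}

/* Process every entry; returns the number of entries handled. */
int fake_join_all(struct fake_idev *idev)
{
	struct fake_ifaddr *snap[MAX_ADDRS];
	struct fake_ifaddr *ifa;
	size_t n = 0, i;

	/* Phase 1: the whole walk happens under one uninterrupted lock hold. */
	fake_lock_acquire(&idev->lock);
	for (ifa = idev->addr_list; ifa != NULL && n < MAX_ADDRS; ifa = ifa->next)
		snap[n++] = ifa; /* real code would also need to pin the entry */
	fake_lock_release(&idev->lock);

	/* Phase 2: the lock is free, so the callee may take it. */
	for (i = 0; i < n; i++)
		fake_join(idev, snap[i]);

	return (int)n;
}

/* Builds a three-entry list and returns 1 if every entry was handled. */
int demo_join_all(void)
{
	struct fake_ifaddr c = { 0, NULL };
	struct fake_ifaddr b = { 0, &c };
	struct fake_ifaddr a = { 0, &b };
	struct fake_idev idev = { { 0 }, &a };
	int n = fake_join_all(&idev);

	return n == 3 && a.visited && b.visited && c.visited && !idev.lock.held;
}
```

The point of the two phases is that the list is never walked while unlocked, and the callee is never entered with the lock held. In real code the snapshot would have to hold a reference on each entry so that it cannot be freed between the two phases.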

>
> I've been chasing a netdev reference leak, when using the GNS3
> simulator. Shutting down the system can result in one interface having
> a netdev reference count of 5, and it never gets destroyed. Using the
> tracker code Eric recently added, i found one of the leaks is idev,
> its reference count does not go to 0, and hence the reference it holds
> on the netdev is never released.
>
> I will test this patch out, see if it helps, but i'm just wondering if
> you think the issue i'm seeing is theoretically possible because of
> this race? If it is, we might want this applied to stable, not just
> net-next.

I am not sure. I believe it may be related, although I think it would be unlikely to happen.
In your case, it could be caused by this non-atomic handling of the list entries:
for example, the loop could perhaps skip an inet6_ifaddr instance if
the list is changed by someone else in the meantime. That instance would then never be put,
so its refcount would never drop. But again, I'm not sure whether this applies to your case.
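
As a rough illustration of that skipping scenario, here is a single-threaded userspace sketch that replays one possible interleaving deterministically. It is a simplification, not the kernel code: fake_ifaddr, fake_ifa_put, walk_and_put and demo_leaked_refs are hypothetical names, the list is a plain singly linked list, and the "concurrent" change is an insert at the head performed at the point where the lock would have been dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for inet6_ifaddr: a refcount and a list link. */
struct fake_ifaddr {
	int refcnt;
	struct fake_ifaddr *next;
};

/* Drop one reference, loosely mirroring what a "put" does. */
static void fake_ifa_put(struct fake_ifaddr *ifa)
{
	assert(ifa->refcnt > 0);
	ifa->refcnt--;
}

/*
 * Walk the list and put every entry the loop sees. If racer is non-NULL,
 * it is linked in at the head while the first entry is being processed,
 * imitating another context changing the list during the unlocked window.
 * Returns the number of entries that still hold a reference afterwards.
 */
static int walk_and_put(struct fake_ifaddr **head, struct fake_ifaddr *racer)
{
	struct fake_ifaddr *ifa;
	int first = 1, leaked = 0;

	for (ifa = *head; ifa != NULL; ifa = ifa->next) {
		/* the lock would be dropped here */
		if (first && racer != NULL) {
			racer->next = *head; /* lands behind the cursor */
			*head = racer;
		}
		first = 0;
		/* the lock would be retaken here */
		fake_ifa_put(ifa);
	}

	for (ifa = *head; ifa != NULL; ifa = ifa->next)
		if (ifa->refcnt > 0)
			leaked++;
	return leaked;
}

/* Two-entry list, optionally with the racing insert; returns leaked count. */
int demo_leaked_refs(int with_race)
{
	struct fake_ifaddr b = { 1, NULL };
	struct fake_ifaddr a = { 1, &b };
	struct fake_ifaddr late = { 1, NULL };
	struct fake_ifaddr *head = &a;

	return walk_and_put(&head, with_race ? &late : NULL);
}
```

Without the racing insert every entry is put. With it, the late entry sits behind the cursor, the loop never visits it, and it keeps its reference. Whether such an interleaving can actually occur in your setup is a separate question.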

>
> Thanks
> Andrew

Kind regards
Niels