Re: [RFC PATCH net] net: ipconfig: Release the rtnl_lock while waiting for carrier

From: Maxime Chevallier
Date: Thu Oct 28 2021 - 02:45:28 EST


Hello Antoine,

On Wed, 27 Oct 2021 18:05:09 +0200
Antoine Tenart <atenart@xxxxxxxxxx> wrote:

>Hi Maxime,
>
>Quoting Maxime Chevallier (2021-10-27 15:19:53)
>> While waiting for a carrier to come on one of the netdevices, some
>> devices will require to take the rtnl lock at some point to fully
>> initialize all parts of the link.
>>
>> That's the case for SFP, where the rtnl is taken when a module gets
>> detected. This prevents mounting an NFS rootfs over an SFP link.
>>
>> This means that while ipconfig waits for carriers to be detected, no SFP
>> module can be detected in the meantime; modules are only detected after
>> ipconfig times out.
>>
>> This commit releases the rtnl_lock while waiting for the carrier to come
>> up, and re-takes it to check for the init device and carrier status.
>>
>> At that point, the rtnl_lock seems to be only protecting
>> ic_is_init_dev().
>>
>> Fixes: 73970055450e ("sfp: add SFP module support")
>
>Was this working with SFP modules before?

From what I can tell, no. In that case, does it need a Fixes tag?
It seems the problem has always been there, and booting an nfsroot
never worked over SFP links.

>
>> diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c
>> index 816d8aad5a68..069ae05bd0a5 100644
>> --- a/net/ipv4/ipconfig.c
>> +++ b/net/ipv4/ipconfig.c
>> @@ -278,7 +278,12 @@ static int __init ic_open_devs(void)
>>  			if (ic_is_init_dev(dev) && netif_carrier_ok(dev))
>>  				goto have_carrier;
>>
>> +		/* Give a chance to do complex initialization that
>> +		 * would require to take the rtnl lock.
>> +		 */
>> +		rtnl_unlock();
>>  		msleep(1);
>> +		rtnl_lock();
>>
>>  		if (time_before(jiffies, next_msg))
>>  			continue;
>
>The rtnl lock is protecting 'for_each_netdev' and 'dev_change_flags' in
>this function. What could happen in theory is a device gets removed from
>the list or has its flags changed. I don't think that's an issue here.
>
>Instead of releasing the lock while sleeping, you could drop the lock
>before the carrier waiting loop (with a similar comment) and only
>protect the above 'for_each_netdev' loop.

Nice catch. The effect should be the same, but it makes it much clearer
what the lock is actually protecting.
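
To make the difference concrete, here is a small userspace model of the
two locking schemes. It is not kernel code and not what the respin will
contain: a pthread mutex stands in for the rtnl lock, and a helper thread
stands in for SFP module detection, which needs the lock before the
carrier can come up. All names (model_rtnl, model_sfp_detect,
model_wait_for_carrier) are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Userspace model only: this mutex stands in for the rtnl lock. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
/* Stands in for netif_carrier_ok() on the init device. */
static atomic_bool model_carrier;

static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Models SFP module detection: the lock is needed before the link is up. */
static void *model_sfp_detect(void *arg)
{
	(void)arg;
	sleep_ms(5);			/* module shows up a bit later */
	pthread_mutex_lock(&model_rtnl);
	atomic_store(&model_carrier, true);
	pthread_mutex_unlock(&model_rtnl);
	return NULL;
}

/*
 * Models the carrier wait.
 * hold_lock_while_waiting == true:  lock held across the whole wait.
 * hold_lock_while_waiting == false: lock dropped before the wait loop.
 * Returns true if the carrier came up before the timeout.
 */
static bool model_wait_for_carrier(bool hold_lock_while_waiting, int timeout_ms)
{
	pthread_t sfp;
	bool up = false;
	int rc;

	atomic_store(&model_carrier, false);

	pthread_mutex_lock(&model_rtnl);
	/* The device list walk and flag changes would sit here, locked. */
	rc = pthread_create(&sfp, NULL, model_sfp_detect, NULL);
	assert(rc == 0);
	(void)rc;
	if (!hold_lock_while_waiting)
		pthread_mutex_unlock(&model_rtnl);

	/* Poll for the carrier roughly once per millisecond. */
	for (int i = 0; i < timeout_ms && !up; i++) {
		up = atomic_load(&model_carrier);
		if (!up)
			sleep_ms(1);
	}

	if (hold_lock_while_waiting)
		pthread_mutex_unlock(&model_rtnl);

	pthread_join(sfp, NULL);
	return up;
}
```

In this model, holding the lock across the wait keeps the detection
thread blocked until the wait times out, which matches what the commit
message describes. Dropping the lock before the wait lets the carrier
show up after a few milliseconds. It needs to be built with -pthread.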

I'll give it a try and respin, thanks for the review!

Maxime

>Antoine



--
Maxime Chevallier, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com