Re: [PATCH] net: Convert net_mutex into rw_semaphore and down read it on net->init/->exit

From: Kirill Tkhai
Date: Fri Nov 17 2017 - 13:37:42 EST


On 15.11.2017 19:31, Eric W. Biederman wrote:
> Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> writes:
>
>> On 15.11.2017 12:51, Kirill Tkhai wrote:
>>> On 15.11.2017 06:19, Eric W. Biederman wrote:
>>>> Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> writes:
>>>>
>>>>> On 14.11.2017 21:39, Cong Wang wrote:
>>>>>> On Tue, Nov 14, 2017 at 5:53 AM, Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> wrote:
>>>>>>> @@ -406,7 +406,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>>>>>
>>>>>>>  	get_user_ns(user_ns);
>>>>>>>
>>>>>>> -	rv = mutex_lock_killable(&net_mutex);
>>>>>>> +	rv = down_read_killable(&net_sem);
>>>>>>>  	if (rv < 0) {
>>>>>>>  		net_free(net);
>>>>>>>  		dec_net_namespaces(ucounts);
>>>>>>> @@ -421,7 +421,7 @@ struct net *copy_net_ns(unsigned long flags,
>>>>>>>  		list_add_tail_rcu(&net->list, &net_namespace_list);
>>>>>>>  		rtnl_unlock();
>>>>>>>  	}
>>>>>>> -	mutex_unlock(&net_mutex);
>>>>>>> +	up_read(&net_sem);
>>>>>>>  	if (rv < 0) {
>>>>>>>  		dec_net_namespaces(ucounts);
>>>>>>>  		put_user_ns(user_ns);
>>>>>>> @@ -446,7 +446,7 @@ static void cleanup_net(struct work_struct *work)
>>>>>>>  	list_replace_init(&cleanup_list, &net_kill_list);
>>>>>>>  	spin_unlock_irq(&cleanup_list_lock);
>>>>>>>
>>>>>>> -	mutex_lock(&net_mutex);
>>>>>>> +	down_read(&net_sem);
>>>>>>>
>>>>>>>  	/* Don't let anyone else find us. */
>>>>>>>  	rtnl_lock();
>>>>>>> @@ -486,7 +486,7 @@ static void cleanup_net(struct work_struct *work)
>>>>>>>  	list_for_each_entry_reverse(ops, &pernet_list, list)
>>>>>>>  		ops_free_list(ops, &net_exit_list);
>>>>>>>
>>>>>>> -	mutex_unlock(&net_mutex);
>>>>>>> +	up_read(&net_sem);
>>>>>>
>>>>>> After your patch, setup_net() could run concurrently with cleanup_net().
>>>>>> Given that ops_exit_list() is also called on the error path of setup_net(),
>>>>>> this means ops->exit() could now run concurrently if it doesn't have its
>>>>>> own lock. Not sure if this breaks any existing user.
>>>>>
>>>>> Yes, ops->init() for one net namespace may now run concurrently with
>>>>> ops->exit() for another one. I haven't found any pernet operations that
>>>>> have a problem with that. If they exist, they are hidden and not obvious.
>>>>> Pernet operations in general do not touch someone else's memory.
>>>>> If one turns out to, KASAN should report it after a while.
>>>>
>>>> Certainly the use of hash tables shared between multiple network
>>>> namespaces would count. I don't remember how many of these we have but
>>>> there used to be quite a few.
>>>
>>> Could you please give an example of the hash tables you mean?
>>
>> Ah, I see, it's dccp_hashinfo etc.

JFYI, I've checked dccp_hashinfo, and it seems to be safe.

>
> The big one used to be the route cache. With resizable hash tables
> things may be getting better in that regard.

I've checked some fib-related code and wasn't able to find such a table.
Could you please clarify whether this is an assumption, or whether you
know of a specific hash table that has this problem? If so, could you
please point me to it more precisely?