On 16/12/2024 12:09, Sabrina Dubroca wrote:
[...]
Maybe we should call cancel_sync_work(&ovpn_sock->work) inside
ovpn_socket_get()?
So the latter will return NULL only when it is sure that the socket has been
detached.
At that point we can skip the following return and continue along the "new
socket" path.
What do you think?
The work may not have been scheduled yet? (small window between the
last kref_put and schedule_work)
Maybe a completion [Documentation/scheduler/completion.rst] would
solve it (but it makes things even more complex, unfortunately):
- at the end of ovpn_socket_detach: complete(&ovpn_sock->detached);
- in ovpn_socket_new when handling EALREADY: wait_for_completion(&ovpn_sock->detached);
- in ovpn_socket_new for the new socket: init_completion(&ovpn_sock- >detached);
but ovpn_sock could be gone immediately after complete(). Maybe
something with completion_done() before the kfree_rcu in
ovpn_socket_detach? I'm not that familiar with the completion API.
It seems the solution we are aiming for is more complex than the concept of ovpn_socket per se :-D
I'll think a bit more about this..maybe we can avoid entering this situation at all..