On 9/8/22 13:14, Ziyang Xuan (William) wrote:
Just another reference which makes it clear that the reordering of function calls in your patch is likely not correct:
https://elixir.bootlin.com/linux/v5.19.7/source/net/packet/af_packet.c#L4734
static int __init packet_init(void)
{
	int rc;

	rc = proto_register(&packet_proto, 0);
	if (rc)
		goto out;
	rc = sock_register(&packet_family_ops);
	if (rc)
		goto out_proto;
	rc = register_pernet_subsys(&packet_net_ops);
	if (rc)
		goto out_sock;
	rc = register_netdevice_notifier(&packet_netdev_notifier);
	if (rc)
		goto out_pernet;

	return 0;

out_pernet:
	unregister_pernet_subsys(&packet_net_ops);
out_sock:
	sock_unregister(PF_PACKET);
out_proto:
	proto_unregister(&packet_proto);
out:
	return rc;
}
Yes, all these socket operations take time, so register_netdevice_notifier() and register_pernet_subsys() have most likely already completed.
But that may not hold in some cases, for example if the CPU that runs {raw,bcm}_module_init() is temporarily stuck,
or if there is contention on the pernet_ops_rwsem lock in register_netdevice_notifier() and register_pernet_subsys().
If the condition I pointed out does happen, I think my solution can solve it.
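
To make the ordering under discussion concrete, here is a minimal userspace sketch of the same register-in-order, unwind-in-reverse shape as packet_init(). It is not kernel code: demo_register(), demo_unregister(), demo_init(), demo_registered_count() and the STEP_* names are hypothetical stand-ins invented for illustration, and the fail_at parameter only exists to inject a failure at a chosen step.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the four registration steps; not kernel APIs. */
enum step { STEP_PROTO, STEP_SOCK, STEP_PERNET, STEP_NOTIFIER, STEP_COUNT };

static int registered[STEP_COUNT];

/* Pretend to register one step; fail if it is the step named by fail_at. */
static int demo_register(enum step s, int fail_at)
{
	if ((int)s == fail_at)
		return -1;
	registered[s] = 1;
	return 0;
}

/* Undo one step; it must have been registered before. */
static void demo_unregister(enum step s)
{
	assert(registered[s]);
	registered[s] = 0;
}

/* Same shape as packet_init(): register in order, unwind in reverse on error. */
static int demo_init(int fail_at)
{
	int rc;

	memset(registered, 0, sizeof(registered));

	rc = demo_register(STEP_PROTO, fail_at);
	if (rc)
		goto out;
	rc = demo_register(STEP_SOCK, fail_at);
	if (rc)
		goto out_proto;
	/* From here until STEP_NOTIFIER succeeds, STEP_SOCK is already set. */
	rc = demo_register(STEP_PERNET, fail_at);
	if (rc)
		goto out_sock;
	rc = demo_register(STEP_NOTIFIER, fail_at);
	if (rc)
		goto out_pernet;

	return 0;

out_pernet:
	demo_unregister(STEP_PERNET);
out_sock:
	demo_unregister(STEP_SOCK);
out_proto:
	demo_unregister(STEP_PROTO);
out:
	return rc;
}

/* Count how many steps are currently registered. */
static int demo_registered_count(void)
{
	int i, n = 0;

	for (i = 0; i < STEP_COUNT; i++)
		n += registered[i];
	return n;
}
```

Calling demo_init(-1) runs all four steps, while demo_init(STEP_NOTIFIER) fails at the last step and leaves nothing registered. The sketch is single-threaded, so it only shows the order of the steps and the interval in which the socket step is done but the notifier step is not; it does not model the stuck CPU or the lock contention mentioned above.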