Re: netlink: GPF in sock_sndtimeo
From: Cong Wang
Date: Mon Dec 12 2016 - 19:11:10 EST
On Mon, Dec 12, 2016 at 2:02 AM, Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> On 2016-12-09 20:13, Cong Wang wrote:
>> Netlink notifier can safely be converted to blocking one, I will send
>> a patch.
>
> I had a quick look at how that might happen. The netlink notifier chain
> is atomic. Would the registered callback function need to spawn a
> one-time thread to avoid blocking?
It is already non-atomic now:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=efa172f42836477bf1ac3c9a3053140df764699c
> I had a look at your patch. It looks attractively simple. The audit
> next tree has patches queued that add an audit_reset function that will
> require more work. I still see some potential gaps.
>
> - If the process messes up (or the sock lookup messes up) it is reset
> in the kauditd thread under the audit_cmd_mutex.
>
> - If the process exits normally or is replaced due to an audit_replace
> error, it is reset from audit_receive_skb under the audit_cmd_mutex.
>
> - If the process dies before the kauditd thread notices, either reap it
> via notifier callback or it needs a check on net exit to reset. This
> last one appears necessary to decrement the sock refcount so the sock
> can be released in netlink_kernel_release().
>
> If we want to be proactive and use the netlink notifier, we assume the
> overhead of adding to the netlink notifier chain and eliminate all the
> other reset calls under the kauditd thread. If we are ok being
> reactionary, then we'll at least need the net exit check on audit_sock.
>
I don't see why we need to check it in net exit if we use a refcnt.
We have two different users of audit_sock, kauditd and netns. If both
take care of the refcnt properly, we don't need to worry about which
one is last, no matter what failures occur in what order.
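
To illustrate the refcnt argument, here is a minimal userspace sketch. It is
not the audit code: struct fake_sock, fake_sock_alloc(), fake_sock_hold() and
fake_sock_put() are hypothetical names made up for this example, and C11
atomics stand in for the kernel's refcounting. The assumption is that the
creator (think netns) owns the initial reference and the second user (think
kauditd) takes its own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared audit_sock; not the kernel's struct sock. */
struct fake_sock {
	atomic_int refcnt;
	bool *freed;	/* demo only: set to true when the last reference is dropped */
};

/* Allocate the object with one reference, owned by the creator. */
static struct fake_sock *fake_sock_alloc(bool *freed_flag)
{
	struct fake_sock *sk = malloc(sizeof(*sk));

	if (!sk)
		return NULL;
	atomic_init(&sk->refcnt, 1);
	sk->freed = freed_flag;
	*freed_flag = false;
	return sk;
}

/* A second user takes its own reference before using the object. */
static void fake_sock_hold(struct fake_sock *sk)
{
	atomic_fetch_add(&sk->refcnt, 1);
}

/* Each user drops its reference when done; whoever is last frees. */
static void fake_sock_put(struct fake_sock *sk)
{
	int old = atomic_fetch_sub(&sk->refcnt, 1);

	assert(old > 0);	/* catch an unbalanced put */
	if (old == 1) {
		*sk->freed = true;
		free(sk);
	}
}
```

With this, the creator calls fake_sock_alloc(), the second user calls
fake_sock_hold(), and each side calls fake_sock_put() exactly once on its own
exit or error path. The object is freed on the second put regardless of which
side gets there first, so neither side needs a special check for the other.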