On Thu, Jan 13, 2022 at 6:57 AM cuigaosheng <cuigaosheng1@xxxxxxxxxx> wrote:
When we add "audit=1" to the cmdline, kauditd will take up 100% of the
CPU, as follows:
Thanks Gaosheng for the bug report, I'm able to reproduce this.
configurations:
auditctl -b 64
auditctl --backlog_wait_time 60000
auditctl -r 0
auditctl -w /root/aaa -p wrx
shell scripts:
#!/bin/bash
i=0
while [ $i -le 66 ]
do
    touch /root/aaa
    let i++
done
mandatory conditions:
add "audit=1" to the cmdline, and run kill -19 pid_number (the pid of /sbin/auditd).
As long as the audit_hold_queue stays non-empty, flushing the hold queue will fall into
an infinite loop.
713 static int kauditd_send_queue(struct sock *sk, u32 portid,
714                               struct sk_buff_head *queue,
715                               unsigned int retry_limit,
716                               void (*skb_hook)(struct sk_buff *skb),
717                               void (*err_hook)(struct sk_buff *skb))
718 {
719         int rc = 0;
720         struct sk_buff *skb;
721         unsigned int failed = 0;
722
723         /* NOTE: kauditd_thread takes care of all our locking, we just use
724          * the netlink info passed to us (e.g. sk and portid) */
725
726         while ((skb = skb_dequeue(queue))) {
727                 /* call the skb_hook for each skb we touch */
728                 if (skb_hook)
729                         (*skb_hook)(skb);
730
731                 /* can we send to anyone via unicast? */
732                 if (!sk) {
733                         if (err_hook)
734                                 (*err_hook)(skb);
735                         continue;
736                 }
737
738 retry:
739                 /* grab an extra skb reference in case of error */
740                 skb_get(skb);
741                 rc = netlink_unicast(sk, skb, portid, 0);
742                 if (rc < 0) {
743                         /* send failed - try a few times unless fatal error */
744                         if (++failed >= retry_limit ||
745                             rc == -ECONNREFUSED || rc == -EPERM) {
746                                 sk = NULL;
747                                 if (err_hook)
748                                         (*err_hook)(skb);
749                                 if (rc == -EAGAIN)
750                                         rc = 0;
751                                 /* continue to drain the queue */
752                                 continue;
753                         } else
754                                 goto retry;
755                 } else {
756                         /* skb sent - drop the extra reference and continue */
757                         consume_skb(skb);
758                         failed = 0;
759                 }
760         }
761
762         return (rc >= 0 ? 0 : rc);
763 }
When kauditd attempts to flush the hold queue, the queue parameter is &audit_hold_queue.
If netlink_unicast() (line 741) returns -EAGAIN, sk is set to NULL (line 746), so err_hook
(kauditd_rehold_skb) is called. After the continue, skb_dequeue() (line 726) and err_hook
(kauditd_rehold_skb, line 733) fall into an infinite loop.
I don't really understand the value of audit_hold_queue. Can we remove it, or stop dropping
the logs into kauditd_rehold_skb when auditd is abnormal?
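The loop described above can be illustrated with a small user-space sketch. This is not kernel code: it assumes, as the report implies, that the error hook puts each record back at the head of the same queue that is being drained. The names struct node, struct queue, dequeue(), queue_head() and drain_with_rehold() are hypothetical stand-ins invented for this illustration, and the max_iters cap exists only so that the demonstration terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: one queued record. */
struct node {
    int id;
    struct node *next;
};

/* Hypothetical stand-in for struct sk_buff_head. */
struct queue {
    struct node *head;
};

/* Remove and return the first record, or NULL if the queue is empty. */
static struct node *dequeue(struct queue *q)
{
    struct node *n = q->head;

    if (n) {
        q->head = n->next;
        n->next = NULL;
    }
    return n;
}

/* Put a record back at the front of the queue. */
static void queue_head(struct queue *q, struct node *n)
{
    n->next = q->head;
    q->head = n;
}

/*
 * Drain q the way the quoted loop behaves once sk is NULL: every record
 * that is dequeued goes straight to the error hook, and the hook (assumed
 * here) re-queues it at the head of the queue being drained.
 * Returns the number of iterations performed, capped at max_iters.
 */
static unsigned long drain_with_rehold(struct queue *q, unsigned long max_iters)
{
    unsigned long iters = 0;
    struct node *n;

    assert(q != NULL);

    while (iters < max_iters && (n = dequeue(q))) {
        queue_head(q, n); /* err_hook: back onto the queue being drained */
        iters++;
    }
    return iters;
}
```

With a single record on the queue, drain_with_rehold() always runs until it hits the cap, because each pass dequeues the record it just re-queued. On an empty queue it returns immediately, which matches the observation that the problem only appears while the hold queue is non-empty.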
I'm looking into it now. I'll report back when I have a better idea of
the problem and a potential fix.