Re: [PATCH] watchdog: Avoid 100% CPU usage during reading watchdog when a task get signal

From: Corey Minyard
Date: Thu May 18 2023 - 11:49:15 EST


On Mon, May 15, 2023 at 05:19:41AM -0700, Yu Chen wrote:
> A simple reproducer demonstrating the problem: (use ipmi_watchdog.ko)
>
> In one terminal:
>
> $ cat /dev/watchdog
> ...
>
> In another terminal:
>
> $ ps -aux | grep cat
> 14755 pts/1 R+ 43:00 cat /dev/watchdog
> 51943 pts/2 S+ 0:00 grep --color=auto cat
>
> $ kill -9 14755
> $
> $ cat /proc/14755/status | grep SigPnd
> SigPnd: 0000000000000100
> $
> $ top
>
> Tasks: 1049 total, 2 running, 1047 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 0.0 us, 1.0 sy, 0.0 ni, 98.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> MiB Mem : 522594.8 total, 517241.4 free, 2922.1 used, 2431.2 buff/cache
> MiB Swap: 0.0 total, 0.0 free, 0.0 used. 516589.2 avail Mem
>
>   PID USER  PR NI   VIRT    RES    SHR S  %CPU %MEM   TIME+ COMMAND
> 14755 root  20  0 215552   1024    576 R 100.0  0.0 0:15.12 cat
> 53417 root  20  0 224960   7040   3648 R   0.7  0.0 0:00.10 top
>    11 root  20  0      0      0      0 I   0.3  0.0 0:02.85 rcu_sched
>  1772 root  20  0 512256 387776 380800 S   0.3  0.1 0:32.05 python
>
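> We can see that once the cat process receives the signal, its CPU usage
> goes to 100%. Because signal_pending() is true, schedule() leaves the
> TASK_INTERRUPTIBLE task runnable and pick_next_task() can pick it again
> right away, so schedule() returns without sleeping every time. Since
> data_to_read is still 0, the while loop in ipmi_read() busy-loops.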
>
> Signed-off-by: Yu Chen <chen.yu@xxxxxxxxxxxx>
> ---
> drivers/char/ipmi/ipmi_watchdog.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/char/ipmi/ipmi_watchdog.c b/drivers/char/ipmi/ipmi_watchdog.c
> index 0d4a8dcac..173ed4266 100644
> --- a/drivers/char/ipmi/ipmi_watchdog.c
> +++ b/drivers/char/ipmi/ipmi_watchdog.c
> @@ -803,6 +803,11 @@ static ssize_t ipmi_read(struct file *file,
>  		init_waitqueue_entry(&wait, current);
>  		add_wait_queue(&read_q, &wait);
>  		while (!data_to_read) {
> +			if (signal_pending(current)) {
> +				remove_wait_queue(&read_q, &wait);
> +				rv = -ERESTARTSYS;
> +				goto out;
> +			}

This skips the call to remove_wait_queue(), which is bad. I
already have a fix for this from someone else.

-corey

>  			set_current_state(TASK_INTERRUPTIBLE);
>  			spin_unlock_irq(&ipmi_read_lock);
>  			schedule();
> @@ -810,10 +815,6 @@ static ssize_t ipmi_read(struct file *file,
>  		}
>  		remove_wait_queue(&read_q, &wait);
>
> -		if (signal_pending(current)) {
> -			rv = -ERESTARTSYS;
> -			goto out;
> -		}
>  	}
>  	data_to_read = 0;
>
> --
> 2.27.0
>
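
For anyone following along, here is a rough userspace model of the wait
loop quoted above. It is only a sketch and not necessarily the fix that is
already queued: it assumes the signal check is folded into the loop
condition so that there is a single remove_wait_queue() exit path. All
names ending in _stub or _model, and the counters, are hypothetical
stand-ins for the kernel calls; nothing here is driver code.

```c
#include <stdbool.h>

#define ERESTARTSYS 512        /* kernel-internal errno value */

static bool data_to_read;      /* stays false: no pretimeout data arrives */
static bool sig_pending;       /* models signal_pending(current) */
static int  on_wait_queue;     /* 1 while the task is queued on read_q */
static int  schedule_calls;    /* number of times the loop body ran */

/*
 * Hypothetical stand-in for schedule(). In the kernel, a
 * TASK_INTERRUPTIBLE task with a signal pending is not put to sleep,
 * so schedule() returns at once. Here the "signal" arrives on the
 * third call.
 */
static void schedule_stub(void)
{
	schedule_calls++;
	if (schedule_calls == 3)
		sig_pending = true;
}

/* Model of the blocking path of ipmi_read(), locking omitted. */
static int ipmi_read_model(void)
{
	int rv = 0;

	if (!data_to_read) {
		on_wait_queue = 1;              /* add_wait_queue() */

		/* Leave the loop on data OR on a pending signal. */
		while (!data_to_read && !sig_pending)
			schedule_stub();

		on_wait_queue = 0;              /* remove_wait_queue(), one exit path */

		if (sig_pending) {
			rv = -ERESTARTSYS;
			goto out;
		}
	}
	data_to_read = false;
out:
	return rv;
}
```

With the condition written as while (!data_to_read) only, the model would
never terminate once sig_pending is set, which mirrors the 100% CPU seen
in the reproducer.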