On 02/19/2017 05:46 PM, Guenter Roeck wrote:
Cc: Wolfram for input.
On 02/17/2017 10:25 AM, Niklas Cassel wrote:
From: Niklas Cassel <niklas.cassel@xxxxxxxx>
Checking for timer expiration is done from the softirq TIMER_SOFTIRQ.
Since commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job"),
pending softirqs are no longer always handled immediately. Instead,
if there are pending softirqs and ksoftirqd is in state TASK_RUNNING,
the handling of the softirqs is deferred, and they are supposed
to be handled by ksoftirqd when ksoftirqd gets scheduled.
If a user space process with a real-time policy starts to misbehave
by never relinquishing the CPU while ksoftirqd is in state TASK_RUNNING,
what will happen is that all softirqs will get deferred, while ksoftirqd,
which is supposed to handle the deferred softirqs, will never get to run.
To make sure that the watchdog is able to fire even when we do not get
to run softirqs, replace the timers with hrtimers.
This makes the driver dependent on HIGH_RES_TIMERS, which is not available
on all architectures. Before adding that restriction, I would like to see
some discussion of whether this is the only feasible solution.
Is this driver the only one with this problem, or is anything using
timers affected?
Anything using timers is affected.
The tick count (jiffies) will still get incremented, but the code checking
for timer expiration is run from a softirq, which in this case never gets
to run, so the timers will never expire.
Before 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), softirqs
were never deferred, so they always got to run when exiting an irq.
So previously with a user space process using all the CPU, like:
chrt -r 99 sh -c "while :; do :; done"
the softdog would still fire.
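
To make the difference concrete, here is a rough user-space model of the
behaviour described above. It is only a sketch, not kernel code: struct
cpu_model, timer_irq(), run_timer_softirq() and watchdog_fires() are
hypothetical names made up for illustration, and the model assumes that an
RT task hogs the CPU for the whole run, so ksoftirqd stays in TASK_RUNNING
but never actually gets scheduled.

```c
#include <stdbool.h>

/* Simplified user-space model; every name here is hypothetical. */
struct cpu_model {
	unsigned long jiffies;    /* tick counter, advanced by the timer irq */
	unsigned long expires;    /* tick at which the watchdog timer is due */
	bool softirq_pending;     /* models a raised TIMER_SOFTIRQ */
	bool ksoftirqd_running;   /* models ksoftirqd in state TASK_RUNNING */
	bool defer_to_ksoftirqd;  /* true: behaviour since 4cd13c21b207 */
	bool watchdog_fired;
};

/* Models the softirq handler: the only place where expiry is checked. */
static void run_timer_softirq(struct cpu_model *c)
{
	c->softirq_pending = false;
	if (c->jiffies >= c->expires)
		c->watchdog_fired = true;
}

/*
 * Models one timer interrupt. The RT hog is assumed to resume right
 * after the irq, so ksoftirqd never gets a chance to run.
 */
static void timer_irq(struct cpu_model *c)
{
	c->jiffies++;
	c->softirq_pending = true;

	/* irq exit: either handle the softirq now or leave it pending */
	if (c->defer_to_ksoftirqd && c->ksoftirqd_running)
		return;	/* deferred to ksoftirqd, which never runs here */

	run_timer_softirq(c);
}

/* Runs 'ticks' timer interrupts and reports whether the watchdog fired. */
static bool watchdog_fires(bool defer_to_ksoftirqd, unsigned long timeout,
			   unsigned long ticks)
{
	struct cpu_model c = {
		.expires = timeout,
		.ksoftirqd_running = true,
		.defer_to_ksoftirqd = defer_to_ksoftirqd,
	};

	for (unsigned long i = 0; i < ticks; i++)
		timer_irq(&c);

	return c.watchdog_fired;
}
```

In this model, watchdog_fires(false, 10, 100) is true, because the expiry
check runs at every irq exit, while watchdog_fires(true, 10, 100) is false,
because the tick counter keeps advancing but the expiry check never runs.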
If we ask the system to run something all the time,
and the system does that, I don't think we can blame the system.
It is however important that the watchdog can still detect and
fire when this happens. Other drivers, not so much.
I guess another solution would be to modify the if-statements in
kernel/softirq.c to sometimes do the softirq directly, even if ksoftirqd
is in state TASK_RUNNING, if we also meet some other condition.
However, do we want to add that extra complexity?