RE: [PATCH 1/3] sched: define a function to report the number of context switches on a CPU
From: Long Li
Date: Wed Aug 21 2019 - 04:20:53 EST
>>>Subject: Re: [PATCH 1/3] sched: define a function to report the number of
>>>context switches on a CPU
>>>
>>>On Mon, Aug 19, 2019 at 11:14:27PM -0700, longli@xxxxxxxxxxxxxxxxx
>>>wrote:
>>>> From: Long Li <longli@xxxxxxxxxxxxx>
>>>>
>>>> The number of context switches on a CPU is useful to determine how
>>>> busy this CPU is on processing IRQs. Export this information so it can
>>>> be used by device drivers.
>>>
>>>Please do explain that; because I'm not seeing how number of switches
>>>relates to processing IRQs _at_all_!
Some kernel components rely on context switch to progress, for example watchdog and RCU. On a CPU with reasonable interrupt load, it continues to make context switches, normally a number of switches per seconds.
While observing a CPU with heavy interrupt loads, I see that it spends all its time in IRQ and softIRQ, and not to get a chance to do a switch (calling __schedule()) for a long time. This will result in system unresponsive at times. The purpose is to find out if this CPU is in this state, and implement some throttling mechanism to help reduce the number of interrupts. I think the number of switches is not accurate for detecting this condition in the most precise way, but maybe it's good enough.
I agree this may not be the best way. If you have other idea on detecting a CPU is swamped by interrupts, please point me to where to look at.
Thanks
Long