Re: [PATCH 2/2] sched/core: split iowait state into two states

From: Jens Axboe
Date: Thu Feb 29 2024 - 12:46:00 EST


On 2/29/24 10:31 AM, Thomas Gleixner wrote:
> On Wed, Feb 28 2024 at 12:16, Jens Axboe wrote:
>> iowait is a bogus metric, but it's helpful in the sense that it allows
>> short waits to not enter sleep states that have a higher exit latency
>> than we would've picked for iowait'ing tasks. However, it's harmful in
>> that lots of applications and monitoring assumes that iowait is busy
>> time, or otherwise use it as a health metric. Particularly for async
>> IO it's entirely nonsensical.
>>
>> Split the iowait part into two parts - one that tracks whether we need
>> boosting for short waits, and one that says we need to account the
>> task
>
> We :)

I appreciate the commit message police :-)

I'll rewrite it.

>> +/*
>> + * Returns a token which is comprised of the two bits of iowait wait state -
>> + * one is whether we're marking ourselves as in iowait for cpufreq reasons,
>> + * and the other is if the task should be accounted as such.
>> + */
>> int io_schedule_prepare(void)
>> {
>> -	int old_iowait = current->in_iowait;
>> +	int old_wait_flags = 0;
>> +
>> +	if (current->in_iowait)
>> +		old_wait_flags |= TASK_IOWAIT;
>> +	if (current->in_iowait_acct)
>> +		old_wait_flags |= TASK_IOWAIT_ACCT;
>>
>> 	current->in_iowait = 1;
>> +	current->in_iowait_acct = 1;
>> 	blk_flush_plug(current->plug, true);
>> -	return old_iowait;
>> +	return old_wait_flags;
>> }
>>
>> -void io_schedule_finish(int token)
>> +void io_schedule_finish(int old_wait_flags)
>> {
>> -	current->in_iowait = token;
>> +	if (!(old_wait_flags & TASK_IOWAIT))
>> +		current->in_iowait = 0;
>> +	if (!(old_wait_flags & TASK_IOWAIT_ACCT))
>> +		current->in_iowait_acct = 0;
>
> Why? TASK_IOWAIT_ACCT requires TASK_IOWAIT, right? So if TASK_IOWAIT was
> not set then TASK_IOWAIT_ACCT must have been clear too, no?

It does, IOWAIT_ACCT always nests inside IOWAIT. I guess it would be
more explanatory as:

	/*
	 * If TASK_IOWAIT isn't set, then TASK_IOWAIT_ACCT cannot have
	 * been set either as it nests inside TASK_IOWAIT.
	 */
	if (!(old_wait_flags & TASK_IOWAIT))
		current->in_iowait = 0;
	else if (!(old_wait_flags & TASK_IOWAIT_ACCT))
		current->in_iowait_acct = 0;

?
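
For reference, the token save/restore scheme from the posted patch can be
modelled in userspace. This is only a sketch: struct task_model and the
model_* helpers are hypothetical stand-ins for current and the scheduler
functions, the flag values are assumed since the quoted hunk doesn't show
their definitions, and the plug flush is left out. It follows the
two-independent-checks form of io_schedule_finish() from the patch as posted.

```c
#include <assert.h>

/* Assumed bit values for this model; the quoted hunk does not define them. */
#define TASK_IOWAIT		0x1
#define TASK_IOWAIT_ACCT	0x2

/* Hypothetical stand-in for the two task_struct bits used by the patch. */
struct task_model {
	unsigned int in_iowait : 1;
	unsigned int in_iowait_acct : 1;
};

/* Save the previous state as a token, then mark the task as in iowait. */
static int model_io_schedule_prepare(struct task_model *t)
{
	int old_wait_flags = 0;

	if (t->in_iowait)
		old_wait_flags |= TASK_IOWAIT;
	if (t->in_iowait_acct)
		old_wait_flags |= TASK_IOWAIT_ACCT;

	t->in_iowait = 1;
	t->in_iowait_acct = 1;
	return old_wait_flags;
}

/* Clear only the bits that were not already set when the token was taken. */
static void model_io_schedule_finish(struct task_model *t, int old_wait_flags)
{
	if (!(old_wait_flags & TASK_IOWAIT))
		t->in_iowait = 0;
	if (!(old_wait_flags & TASK_IOWAIT_ACCT))
		t->in_iowait_acct = 0;
}

/* Nested prepare/finish pairs: the inner finish must not clear outer state. */
static void model_nested_example(void)
{
	struct task_model t = { 0, 0 };
	int outer = model_io_schedule_prepare(&t);	/* token records neither bit */
	int inner = model_io_schedule_prepare(&t);	/* token records both bits */

	model_io_schedule_finish(&t, inner);
	assert(t.in_iowait && t.in_iowait_acct);	/* still inside the outer wait */

	model_io_schedule_finish(&t, outer);
	assert(!t.in_iowait && !t.in_iowait_acct);	/* fully restored */
}
```

Calling model_nested_example() walks through one nested pair: the inner
finish leaves both bits set because its token recorded them, and the outer
finish clears both because its token recorded neither.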

--
Jens Axboe