Re: [PATCH 2/2] sched/wait: add wait_event_idle_exclusive_lifo()

From: NeilBrown
Date: Fri Dec 22 2017 - 18:10:20 EST

On Fri, Dec 22 2017, Peter Zijlstra wrote:

> On Fri, Dec 22, 2017 at 02:11:04PM +1100, NeilBrown wrote:
>> wait_event_*_exclusive() adds new waiters to the end of the
>> quest, while non-exclusive wait_event adds to the head.
>> This ensures that a wake_up will wake all non-exclusive
>> waiters and at most one exclusive wait, but it means that
>> exclusive waiters are woken in a FIFO order, so the task
>> woken is the one least likely to have data in the CPU cache.
>> When simple interaction with non-exclusive waiters is not
>> important, and when choosing a cache-hot task is, the new
>> wait_event_idle_exclusive_lifo()
>> and
>> wait_event_idle_exclusive_lifo_timeout()
>> can be used. To implement these we introduce a new
>> WQ_FLAG_LIFO which causes prepare_to_wait_event() to
>> add to the head of the queue.
>> This will be used to allow lustre's l_wait_event() to be
>> replaced with more standard wait.h macros.
> Urgh, so the problem with lifo is that it tends to generate starvation
> so you have to be very careful with how one uses it.

True, but given that starvation is key to the design goal here, that
doesn't seem like an argument against it. If we can starve some
processes, they get thinner (in terms of CPU cache usage) and so impose
less burden... All processes doing the exclusive_lifo wait are
effectively equal, so if several are idle it doesn't matter at all for
correctness which is woken.

This would be an argument against a wait_event* version that uses
TASK_UNINTERRUPTIBLE, but not against one that uses TASK_IDLE.

> Is there really a measurable difference if you make lustre use the
> regular fifo stuff?

The only background I know is a commit from
which says:

commit 40e312a8275ed9240e63f0ac023d8b7a38136f42
Author: Jian Yu <jian.yu@xxxxxxxxxx>
Date: Wed Dec 1 20:16:21 2010 +0800

b=23289 new API: cfs_waitq_add_exclusive_head

With this patch, we can reduce total number of active threads because
waitq is a LIFO list for exclusive waiting.

o=Liang Zhen

Maybe someone more familiar with lustre history could help??

>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>> ---
>> include/linux/wait.h | 95 +++++++++++++++++++++++++++++++++++++++++++++++---
>> kernel/sched/wait.c | 3 +-
>> 2 files changed, 91 insertions(+), 7 deletions(-)
>> diff --git a/include/linux/wait.h b/include/linux/wait.h
>> index 3aea0780c9d0..49cb393c53d5 100644
>> --- a/include/linux/wait.h
>> +++ b/include/linux/wait.h
>> @@ -20,6 +20,9 @@ int default_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int
>> #define WQ_FLAG_EXCLUSIVE 0x01
>> #define WQ_FLAG_WOKEN 0x02
>> #define WQ_FLAG_BOOKMARK 0x04
>> +#define WQ_FLAG_LIFO 0x08 /* used with WQ_FLAG_EXCLUSIVE to force
>> + * LIFO scheduling in prepare_to_wait_event().
>> + */
> This is not an acceptable comment style.
Fixed to

* to force LIFO scheduling in prepare_to_wait_event().
#define WQ_FLAG_LIFO 0x08


Attachment: signature.asc
Description: PGP signature