Re: [RFC PATCH v2] sched_pair_cpu: Introduce scheduler task pairing system call

From: Mathieu Desnoyers
Date: Fri Jun 26 2020 - 11:16:21 EST


----- On Jun 25, 2020, at 12:34 PM, Mathieu Desnoyers mathieu.desnoyers@xxxxxxxxxxxx wrote:

> ----- On Jun 25, 2020, at 10:56 AM, Mathieu Desnoyers
> mathieu.desnoyers@xxxxxxxxxxxx wrote:
>
>> ----- On Jun 24, 2020, at 3:50 PM, Peter Zijlstra peterz@xxxxxxxxxxxxx wrote:
>>
>>> On Wed, Jun 24, 2020 at 02:31:33PM -0400, Mathieu Desnoyers wrote:
>>>
>> [...]
>>> The other alternative is using a preempt_notifier for the worker I
>>> suppose.
>>
> [...]
>>>
>>> preempt_notifier could work here too I suppose, install it on yourself
>>> when you do the pair syscall and take it away again when you're finished
>>> with it.
>
> The issue I currently have with preempt notifiers is that I need to
> send an IPI from a sched_out notifier, which has interrupts off and
> holds the rq lock. smp_call_function_single() warns due to irq off, and
> indeed it triggers deadlocks.
>
> Before using preempt notifiers, I was touching the "prev" task after
> irqs were reenabled and rq lock was released, which allowed me to
> send an IPI from that context.
>
> Any thoughts on how to best solve this?

I think I may have found a way out of this: I may not need to use
smp_call_function_single() at all.

When preempting a paired task, I think we can rely on the memory barrier at the
beginning of scheduling of the paired task to match the memory barrier at
the end of scheduling of the kworker thread to provide memory ordering. Therefore,
the IPI is not needed at all in this case.
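
For illustration only, here is a minimal userspace sketch of the pairing
argument above, using C11 fences and pthreads rather than the actual scheduler
paths. Every name in it (payload, handoff, worker_side, paired_side,
run_fence_pairing) is hypothetical, and the seq_cst fences merely stand in for
the barriers at the end and beginning of scheduling; it is an analogy under
those assumptions, not the patch's implementation.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins: plain data touched by the "kworker" side and a
 * flag modelling the point where the paired task gets to run. */
static int payload;
static atomic_int handoff;

static void *worker_side(void *arg)
{
	(void)arg;
	payload = 42;				/* plain store */
	/* Analog of the full barrier at the end of scheduling the worker. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_store_explicit(&handoff, 1, memory_order_relaxed);
	return NULL;
}

static void *paired_side(void *arg)
{
	int *seen = arg;

	while (!atomic_load_explicit(&handoff, memory_order_relaxed))
		;				/* wait until "scheduled in" */
	/* Analog of the full barrier at the beginning of scheduling. */
	atomic_thread_fence(memory_order_seq_cst);
	*seen = payload;			/* ordered after the worker's store */
	return NULL;
}

/* Runs both sides once and returns what the paired side observed. */
int run_fence_pairing(void)
{
	pthread_t w, p;
	int seen = 0;

	payload = 0;
	atomic_store(&handoff, 0);
	pthread_create(&p, NULL, paired_side, &seen);
	pthread_create(&w, NULL, worker_side, NULL);
	pthread_join(w, NULL);
	pthread_join(p, NULL);
	return seen;
}
```

Calling run_fence_pairing() returns 42: the two fences pair up through the
handoff flag, so no extra cross-thread signal is needed to order the payload
accesses. Build with -pthread.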

When preempting the kworker thread, things are a bit trickier. AFAIU I can simply
queue task work on the paired task directly without an IPI, and then use
kick_process() on the paired task.

The remaining concern is whether kick_process() (and thus smp_send_reschedule())
is sufficient to guarantee a memory barrier before smp_send_reschedule() returns?
I suspect not, because it only raises the IPI, and does not appear to wait for
its handler to complete. In that case I need a release on the paired task and
an acquire in sched_out of the kworker. The memory barrier at the end of schedule
fulfills the acquire, but I don't see how the acquire is done on the paired task,
because execution of its scheduler does not necessarily happen immediately when
the IPI is raised.
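
To make the concern concrete, here is a rough userspace analogy, again with
C11 atomics and pthreads and not kernel code. It assumes a hypothetical kick()
that only raises a flag and returns immediately, in the way the kick is
suspected above to return without waiting for its handler. The names
kick_pending, ack, target_state and kick_and_wait are made up for the example.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int kick_pending;	/* models the raised (not yet handled) kick */
static atomic_int ack;		/* published by the target once it has run */
static int target_state;	/* plain data written by the target */

static void *target_thread(void *arg)
{
	(void)arg;
	/* The target notices the kick at some later, unspecified time. */
	while (!atomic_load_explicit(&kick_pending, memory_order_acquire))
		;
	target_state = 7;
	/* Release: makes the store to target_state visible before ack. */
	atomic_store_explicit(&ack, 1, memory_order_release);
	return NULL;
}

/* Raises the kick and returns at once; it does not wait for the target. */
static void kick(void)
{
	atomic_store_explicit(&kick_pending, 1, memory_order_release);
}

/* Kicks the target, then explicitly waits for its release with an acquire. */
int kick_and_wait(void)
{
	pthread_t t;
	int seen;

	target_state = 0;
	atomic_store(&ack, 0);
	atomic_store(&kick_pending, 0);
	pthread_create(&t, NULL, target_thread, NULL);

	kick();
	/* Nothing here guarantees that the target has run yet. */
	while (!atomic_load_explicit(&ack, memory_order_acquire))
		;			/* acquire pairs with the release above */
	seen = target_state;

	pthread_join(t, NULL);
	return seen;
}
```

kick_and_wait() returns 7 only because the caller spins on ack with an
acquire load. The return of kick() alone orders nothing with respect to
whatever the target does later.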

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com