Re: [PATCH v1] kthread/smpboot: Serialize kthread parking against wakeup

From: Kohli, Gaurav
Date: Tue May 01 2018 - 03:51:16 EST


Sorry for the spam; adding the list.

On 4/30/2018 4:47 PM, Peter Zijlstra wrote:
On Thu, Apr 26, 2018 at 09:23:25PM +0530, Kohli, Gaurav wrote:
On 4/26/2018 2:27 PM, Peter Zijlstra wrote:

On Thu, Apr 26, 2018 at 10:41:31AM +0200, Peter Zijlstra wrote:
diff --git a/kernel/kthread.c b/kernel/kthread.c
index cd50e99202b0..4b6503c6a029 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -177,12 +177,13 @@ void *kthread_probe_data(struct task_struct *task)
static void __kthread_parkme(struct kthread *self)
{
- __set_current_state(TASK_PARKED);
- while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
+ for (;;) {
+ __set_task_state(TASK_PARKED);
set_current_state(TASK_PARKED);

of course..

Hi Peter,

Maybe I am missing something, but I think the race can still occur, since TASK_PARKED is not handled as a special state.

Controller                      Hotplug

                                Loop
                                Task_Interruptible
Set SHOULD_PARK
wakeup -> Proceeds
                                Set Running
                                kthread_parkme
                                Set TASK_PARKED
                                schedule
Set TASK_RUNNING

Please correct me if I have misunderstood this.

If that could happen, all wait-loops would be broken. However,
AFAICT that cannot happen, because ttwu_remote() and schedule()
serialize on rq->lock. See:


        A                               B

        for (;;) {
          set_current_state(UNINTERRUPTIBLE);

                                        cond1 = true;
                                        wake_up_process(A)
                                          lock(A->pi_lock)
                                          smp_mb__after_spinlock()
                                          if (A->state & TASK_NORMAL)
                                            A->on_rq && ttwu_remote()
          if (cond1) // true
            break;
          schedule();
        }
        __set_current_state(RUNNING);


Hi Peter,

Sorry for the late reply; I was on leave.

Thanks for the new patches. We will apply them and test whether the issue still reproduces.

But in our earlier case, where we saw the failure, the wakeup path and ftrace output are shown below. The wakeup occurred and completed before the schedule() call.

So the final state of cpuhp is running, not parked. I have also pasted the debug ftrace output that we captured while reproducing the issue.

The wakeup path for cpuhp is:

takedown_cpu -> kthread_park -> wake_up_process


39,034,311,742,395 apps (10240) Trace Printk cpuhp/0 (16) [000] 39015.625000: <debug> __kthread_parkme state=512 task=ffffffcc7458e680 flags: 0x5 -> state 5 -> state is parked inside parkme function

39,034,311,846,510 apps (10240) Trace Printk cpuhp/0 (16) [000] 39015.625000: <debug> before schedule __kthread_parkme state=0 task=ffffffcc7458e680 flags: 0xd -> just before schedule call, state is running

static void __kthread_parkme(struct kthread *self)
{
	__set_current_state(TASK_PARKED);
	while (test_bit(KTHREAD_SHOULD_PARK, &self->flags)) {
		if (!test_and_set_bit(KTHREAD_IS_PARKED, &self->flags))
			complete(&self->parked);
		schedule();
		__set_current_state(TASK_PARKED);
	}
	clear_bit(KTHREAD_IS_PARKED, &self->flags);
	__set_current_state(TASK_RUNNING);
}

So my point is that here too, if the thread is rescheduled it can set TASK_PARKED again, but it seems that after the takedown_cpu call this thread never gets a chance to run, so its final state is TASK_RUNNING.

With the current fix, can't we observe the same scenario, where the final state is TASK_RUNNING?

Regards

Gaurav

        for (;;) {
          set_current_state(UNINTERRUPTIBLE);
          if (cond2)
            break;

          schedule();
            lock(rq->lock)
            smp_mb__after_spinlock();
            deactivate_task(A);
            <sched-out>
            unlock(rq->lock);
                                        rq = __task_rq_lock(A)
                                        if (A->on_rq) // false
                                          A->state = TASK_RUNNING;
                                        __task_rq_unlock(rq)


Either A's schedule() must observe RUNNING (not shown) or B must
observe !A->on_rq (shown) and not issue the store.
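
As a rough userspace analogy of the argument above (not kernel code, and not part of any patch in this thread), the same "serialize sleeper and waker on one lock" idea can be sketched with pthreads. Everything here is hypothetical naming for illustration: the mutex stands in for rq->lock, the condition variable for the wakeup, and struct waiter, thread_a(), thread_b() and run_wait_loop_demo() are made up for this sketch. Because A tests cond and goes to sleep under the same lock that B holds while setting cond and signalling, A either sees cond already set or is asleep when the signal arrives, so the wakeup should not be lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins: 'lock' plays the part of rq->lock,
 * 'cond' is the condition the wait-loop tests.
 */
struct waiter {
	pthread_mutex_t lock;
	pthread_cond_t wake;
	bool cond;
	bool left_loop;
};

/* A: test the condition and go to sleep under the same lock. */
static void *thread_a(void *arg)
{
	struct waiter *w = arg;

	pthread_mutex_lock(&w->lock);
	while (!w->cond)
		pthread_cond_wait(&w->wake, &w->lock); /* unlock + sleep atomically */
	assert(w->cond);	/* A only leaves the loop after seeing cond */
	w->left_loop = true;
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* B: publish the condition, then wake A, under the same lock. */
static void *thread_b(void *arg)
{
	struct waiter *w = arg;

	pthread_mutex_lock(&w->lock);
	w->cond = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Returns 1 if A left its wait-loop, 0 if setup failed. */
int run_wait_loop_demo(void)
{
	struct waiter w = { .cond = false, .left_loop = false };
	pthread_t a, b;

	if (pthread_mutex_init(&w.lock, NULL) || pthread_cond_init(&w.wake, NULL))
		return 0;
	if (pthread_create(&a, NULL, thread_a, &w))
		return 0;
	if (pthread_create(&b, NULL, thread_b, &w))
		thread_b(&w);	/* fall back to waking A from this thread */
	else
		pthread_join(b, NULL);
	pthread_join(a, NULL);

	pthread_cond_destroy(&w.wake);
	pthread_mutex_destroy(&w.lock);
	return w.left_loop ? 1 : 0;
}
```

Calling run_wait_loop_demo() from a main() and building with -pthread should be enough to try it. The analogy is loose: the kernel wait-loop relies on set_current_state() plus the rq->lock ordering in schedule()/ttwu_remote(), whereas here the mutex alone provides the ordering.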


--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.