Re: [PATCH] stop_machine: Remove cpu swap from stop_two_cpus
From: Sodagudi Prasad
Date: Wed Jun 27 2018 - 10:07:58 EST
On 2018-06-27 00:15, Sebastian Andrzej Siewior wrote:
On 2018-06-26 14:28:26 [-0700], Isaac J. Manjarres wrote:
Remove CPU ID swapping in stop_two_cpus() so that the
source CPU's stopper thread is added to the wake queue last,
so that the source CPU's stopper thread is woken up last,
ensuring that all other threads that it depends on are woken
up before it runs.
You can't do that because you could deadlock while locking the stoper
lock.
<Prasad> Without this change boot up issues are observed with Linux
4.14.52.
One of the core is executing the stopper thread after wake_up_q() in
cpu_stop_queue_two_works() function, without waking up other cores
stopper thread.
We see this issue 100% on device boot up with Linux 4.14.52.
Could you please explain bit more how the deadlock occurs?
static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work
*work1,
int cpu2, struct cpu_stop_work
*work2)
{
struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
DEFINE_WAKE_Q(wakeq);
int err;
retry:
raw_spin_lock_irq(&stopper1->lock);
raw_spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
<SNIP>
I think, you are suggesting to switch the locking sequence too.
stopper2->lock and stopper1->lock.
could you please share the test case to stress this code flow?
Couldn't you swap cpu1+cpu2 and work1+work2?
<Prasad> Work1 and work2 are having same data contents.
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index f89014a..d10d633 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -307,8 +307,6 @@ int stop_two_cpus(unsigned int cpu1, unsigned int
cpu2, cpu_stop_fn_t fn, void *
cpu_stop_init_done(&done, 2);
set_state(&msdata, MULTI_STOP_PREPARE);
- if (cpu1 > cpu2)
- swap(cpu1, cpu2);
if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
return -ENOENT;
Sebastian
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum,
Linux Foundation Collaborative Project