Re: system hung up when offlining CPUs

From: Thomas Gleixner
Date: Tue Oct 03 2017 - 17:45:01 EST


On Mon, 2 Oct 2017, YASUAKI ISHIMATSU wrote:
> On 09/16/2017 11:02 AM, Thomas Gleixner wrote:
> > Which driver are we talking about?
>
> We are talking about megasas driver.

Can you please apply the debug patch below?

After booting, enable stack traces for the tracer:

# echo 1 >/sys/kernel/debug/tracing/options/stacktrace

Then offline CPUs 24-29. After that do

# cat /sys/kernel/debug/tracing/trace >somefile

Please compress the file and upload it somewhere. If you have no place to
upload it, send it to me in private mail.
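
For convenience, the steps above could be wrapped in a small script along
these lines. This is only a sketch, not part of the original request: the
function name and the TRACING_DIR, CPU_DIR and OUT variables are made up for
illustration, and it assumes the CPUs are offlined through the usual
/sys/devices/system/cpu/cpuN/online files and that gzip is available.

```shell
#!/bin/sh
# Sketch only: run as root on a kernel with the debug patch applied.
# TRACING_DIR, CPU_DIR and OUT are illustrative names, not from the mail;
# the defaults point at the standard debugfs and CPU hotplug locations.
TRACING_DIR=${TRACING_DIR:-/sys/kernel/debug/tracing}
CPU_DIR=${CPU_DIR:-/sys/devices/system/cpu}
OUT=${OUT:-somefile}

capture_offline_trace() {
    # Record a stack trace with every trace entry.
    echo 1 > "$TRACING_DIR/options/stacktrace" || return 1

    # Offline CPUs 24-29 one at a time via the hotplug sysfs files.
    for cpu in 24 25 26 27 28 29; do
        echo 0 > "$CPU_DIR/cpu$cpu/online" || return 1
    done

    # Save the trace buffer and compress it for upload.
    cat "$TRACING_DIR/trace" > "$OUT" || return 1
    gzip -f "$OUT"
}
```

Calling capture_offline_trace after boot should leave somefile.gz in the
current directory. If the machine hangs while offlining, the trace has to be
collected some other way.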

Thanks,

tglx

8<------------
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -171,11 +171,16 @@ void irq_set_thread_affinity(struct irq_
 int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 			bool force)
 {
+	const struct cpumask *eff = irq_data_get_effective_affinity_mask(data);
 	struct irq_desc *desc = irq_data_to_desc(data);
 	struct irq_chip *chip = irq_data_get_irq_chip(data);
 	int ret;

 	ret = chip->irq_set_affinity(data, mask, force);
+
+	trace_printk("irq: %u ret %d mask: %*pbl eff: %*pbl\n", data->irq, ret,
+		     cpumask_pr_args(mask), cpumask_pr_args(eff));
+
 	switch (ret) {
 	case IRQ_SET_MASK_OK:
 	case IRQ_SET_MASK_OK_DONE: