Re: NOHZ: WARNING: at arch/x86/kernel/smp.c:123 native_smp_send_reschedule, round 2

From: Borislav Petkov
Date: Tue May 21 2013 - 03:21:48 EST


On Tue, May 21, 2013 at 10:20:51AM +0800, Michael Wang wrote:
> This is not enough to prove that policy->cpus is wrong: the cpu could
> be online when it is read from policy->cpus but offline by the time it
> is checked here, since hotplug can happen in between.

Strictly speaking you're correct but I don't do any hotplug besides the
one-time thing which is part of halting the box.

> I don't get it...
>
> get_online_cpus() just stops hotplug from happening after it has been
> invoked, so unless policy->cpus is really wrong, none of the cpus in
> its mask can go offline any more.

Yes, that's my impression too - at the point we do gov_queue_work,
policy->cpus already contains offline cpus.

> This protects nothing... before we get here, the cpu could already be
> offline, so nothing has changed...

Yes, but I don't want to schedule work on an offlined cpu and that is
ensured here.
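
For illustration only, here is a minimal user-space sketch of the kind of
check being discussed: walk the cpus in a policy mask and skip any that
are not online before queueing work. It is not the actual cpufreq
governor code. The 64-bit masks and the names mask_test() and
queue_work_on_online() are hypothetical stand-ins for the kernel's
cpumask handling, and the sketch ignores the hotplug race raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64	/* hypothetical limit for this sketch */

/* Hypothetical stand-in for a cpumask bit test. */
static bool mask_test(uint64_t mask, unsigned int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * Hypothetical helper: "queue" work on every cpu that is set in
 * policy_cpus AND in online. The cpus that were picked are reported
 * through *queued; the return value is how many there were.
 */
static unsigned int queue_work_on_online(uint64_t policy_cpus,
					 uint64_t online,
					 uint64_t *queued)
{
	unsigned int cpu, n = 0;

	assert(queued != NULL);
	*queued = 0;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!mask_test(policy_cpus, cpu))
			continue;
		/* Skip cpus that are in the policy mask but offline. */
		if (!mask_test(online, cpu))
			continue;
		*queued |= (uint64_t)1 << cpu;
		n++;
	}
	return n;
}
```

With a policy mask of 0xF and an online mask of 0x7, the sketch picks
cpus 0-2 and skips cpu 3, which is the behaviour the check is meant to
guarantee.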

> If you really want to confirm that policy->cpus was wrong, the way to
> do it is to apply the fix I suggested, then check for online here.

Sure, feel free to get a box, enable NO_HZ_FULL and do all the
experimentation you desire. I surely cannot be the only one who
triggers this.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.