Re: [PATCH] abort secondary CPU bring-up gracefully if do_boot_cpu timed out on cpu_callin_mask

From: Igor Mammedov
Date: Thu Mar 06 2014 - 05:21:56 EST


On Thu, 6 Mar 2014 08:08:32 +0100
Ingo Molnar <mingo@xxxxxxxxxx> wrote:

>
> * Igor Mammedov <imammedo@xxxxxxxxxx> wrote:
>
> > Master CPU may timeout before cpu_callin_mask is set and cancel
> > booting CPU, but being onlined CPU still continues to boot, sets
> > cpu_active_mask (CPU_STARTING notifiers) and spins in
> > check_tsc_sync_target() for master cpu to arrive. Following attempt
> > to online another cpu hangs in stop_machine, initiated from here:
>
> The changelog needs to prominently contain a description of the
> practical relevance of this patch: has the hang triggered on any
> system and under what circumstances, and did the patch resolve the
> hang, etc.?

The hang is observed on virtual machines during CPU hotplug,
especially in large guests with many CPUs. It happens more
often if the host is over-committed.

A similar patch has been carried in RHEL6 since 2012 and it fixes
the issue there. When applied to the upstream kernel, it also fixes
the issue.

>
> Thanks,
>
> Ingo

Thanks,
Igor.