Re: [PATCH] abort secondary CPU bring-up gracefully if do_boot_cpu timed out on cpu_callin_mask

From: Ingo Molnar
Date: Thu Mar 06 2014 - 08:32:23 EST



* Igor Mammedov <imammedo@xxxxxxxxxx> wrote:

> On Thu, 6 Mar 2014 08:08:32 +0100
> Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> >
> > * Igor Mammedov <imammedo@xxxxxxxxxx> wrote:
> >
> > > The master CPU may time out before cpu_callin_mask is set and
> > > cancel the bring-up, but the CPU being onlined still continues to
> > > boot, sets cpu_active_mask (CPU_STARTING notifiers) and spins in
> > > check_tsc_sync_target() waiting for the master CPU to arrive. A
> > > subsequent attempt to online another CPU hangs in stop_machine,
> > > initiated from here:
> >
> > The changelog needs to prominently contain a description of the
> > practical relevance of this patch: has the hang triggered on any
> > system and under what circumstances, and did the patch resolve the
> > hang, etc.?
>
> The hang is observed on virtual machines during CPU hotplug,
> especially in big guests with many CPUs. (It happens more often if
> the host is over-committed.)
>
> A similar patch has been carried in RHEL6 since 2012 and fixes the
> issue there; applied to the upstream kernel, it also fixes the issue.

Okay, cool - please update the patch description with that and
resubmit.

Thanks,

Ingo