Re: 2.6.12-mm1 boot failure on NUMA box.

From: Nish Aravamudan
Date: Fri Jun 24 2005 - 15:13:04 EST


On 6/24/05, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Martin J. Bligh <mbligh@xxxxxxxxxx> wrote:
>
> > > - /*
> > > - * In the NUMA case we dont use the TSC as they are not
> > > - * synchronized across all CPUs.
> > > - */
> > > -#ifndef CONFIG_NUMA
> > > - if (!use_tsc)
> > > -#endif
> > > + if (!cpu_has_tsc)
> > > /* no locking but a rare wrong value is not a big deal */
> > > return jiffies_64 * (1000000000 / HZ);
> >
> > Humpf. That does look dangerous on a NUMA-Q. The TSCs aren't synced,
> > and we can't use them .... have to use PIT, whether the CPUs have TSC
> > or not.
>
> is the only problem the unsyncedness? That should be fine as far as the
> scheduler is concerned. (we compensate for per-CPU drifts)

I'm pretty sure that if the TSC gets used at all on NUMA-Q, the machine
will hang. Whenever I see that "synchronizing TSC across ## CPUs"
message at boot, I know my test is going to fail on NUMA-Q :) Where the
hang occurs is not consistent, either: sometimes the machine boots but
then hangs in the middle of kernbench. In any case, the solution is not
to use the TSC on NUMA-Q. Martin may be able to give more technical
reasons.
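
For illustration only, here is a rough userspace sketch of the selection
being argued for: fall back to the jiffies-based value whenever the CPU
has no TSC or the platform's TSCs cannot be trusted. This is not the
kernel code. struct clock_state, tsc_unsynced, jiffies_to_ns() and
sketch_sched_clock() are made-up names, and the TSC-to-nanoseconds
conversion is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical snapshot of the inputs the decision depends on. */
struct clock_state {
    bool     cpu_has_tsc;   /* CPU implements a TSC at all */
    bool     tsc_unsynced;  /* assumed flag: TSCs differ across nodes (NUMA-Q) */
    uint64_t jiffies_64;    /* timer ticks since boot */
    uint64_t tsc_cycles;    /* raw TSC reading */
    uint64_t tsc_khz;       /* TSC frequency in kHz */
    unsigned hz;            /* timer interrupt frequency (HZ) */
};

/* Tick-based time: resolution is one tick, i.e. 1e9 / HZ nanoseconds. */
static uint64_t jiffies_to_ns(uint64_t jiffies, unsigned hz)
{
    assert(hz > 0);
    return jiffies * (NSEC_PER_SEC / hz);
}

/* Use the TSC only when it exists AND is usable on this platform. */
static uint64_t sketch_sched_clock(const struct clock_state *s)
{
    if (!s->cpu_has_tsc || s->tsc_unsynced)
        return jiffies_to_ns(s->jiffies_64, s->hz);

    assert(s->tsc_khz > 0);
    /* cycles / kHz gives milliseconds; scale by 1e6 to get nanoseconds.
     * A sketch only: this multiplication overflows for large counts. */
    return s->tsc_cycles * 1000000ULL / s->tsc_khz;
}
```

With HZ=1000 and 250 ticks elapsed, the fallback path returns 250 ms
expressed in nanoseconds, regardless of what the TSC reads.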

Thanks,
Nish