Re: 2.6.12-mm1 boot failure on NUMA box.

From: Martin J. Bligh
Date: Fri Jun 24 2005 - 16:14:53 EST




--On Friday, June 24, 2005 21:52:48 +0200 Ingo Molnar <mingo@xxxxxxx> wrote:

>
> * Martin J. Bligh <mbligh@xxxxxxxxxx> wrote:
>
>> > - /*
>> > - * In the NUMA case we dont use the TSC as they are not
>> > - * synchronized across all CPUs.
>> > - */
>> > -#ifndef CONFIG_NUMA
>> > - if (!use_tsc)
>> > -#endif
>> > + if (!cpu_has_tsc)
>> > /* no locking but a rare wrong value is not a big deal */
>> > return jiffies_64 * (1000000000 / HZ);
>>
>> Humpf. That does look dangerous on a NUMA-Q. The TSCs aren't synced,
>> and we can't use them .... have to use PIT, whether the CPUs have TSC
>> or not.
>
> is the only problem the unsyncedness? That should be fine as far as the
> scheduler is concerned. (we compensate for per-CPU drifts)

Well, I think so. But I don't see how you're going to compensate for
large-scale unsynced-ness safely. I've always completely avoided the
TSC altogether on NUMA-Q ... would prefer to keep it that way. We got
lots of weird random crashes, panics, hangs, and -ve time offsets
returned from userspace time commands last time I tried it.
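
To illustrate the -ve offsets, here is a minimal userspace sketch in C (not
kernel code) of what can happen when a start timestamp is read on one CPU
and the end timestamp on another whose counter is not synced with it. The
struct, the function names and the offset values are all hypothetical and
chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical model: each CPU's counter started from a different base. */
struct cpu_clock {
    uint64_t tsc_offset; /* cycles this CPU's counter is ahead of a shared reference */
};

/* Counter value as seen on one CPU at reference time `now` (in cycles). */
static uint64_t read_counter(const struct cpu_clock *cpu, uint64_t now)
{
    return now + cpu->tsc_offset;
}

/*
 * Elapsed cycles computed from two raw readings. Nothing here knows
 * which CPU each reading came from, so any offset between the two
 * counters leaks straight into the result.
 */
static int64_t elapsed(uint64_t start, uint64_t end)
{
    return (int64_t)(end - start);
}
```

With an assumed offset of 5000000 cycles on the first CPU and 0 on the
second, a reading taken at reference time 1000 on the first and at 2000 on
the second gives elapsed() == -4999000, even though 1000 cycles really
passed. Both readings on the same CPU give the expected 1000.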

M.
