Re: [BUG] 2.5.63: ESR killed my box!

From: Linus Torvalds (torvalds@transmeta.com)
Date: Wed Feb 26 2003 - 17:05:18 EST


On Wed, 26 Feb 2003, Ion Badulescu wrote:
>
> Mikael's patch (included in the previous message) changes this to
>
> boot_cpu_physical_apicid = -1U;
>
> which is the same thing indeed.

Yeah, I'd rather remove it if this is it.

> It's not enough. There are two other problems, further down in
> APIC_init_uniprocessor():
>
> 1) apic_write_around(APIC_ID, boot_cpu_physical_apicid) places the APIC
> value in the lower 8 bits of APIC_ID, when it should be in the upper 8. As
> a result, it effectively forces the APIC id to always be 0 for the boot
> CPU, which is fatal on SMP AMD boxes.

Wouldn't it be nicer to just fix the write instead? I can see the
potential to actually want to change the APIC ID - in particular, if the
SMP MP tables say that the APIC ID for the BP should be X, maybe we should
actually write X to it instead of just using what is there.
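
Fixing the write would come down to shifting the ID into the top byte of
the register before storing it. A minimal sketch, assuming the ID field
occupies bits 31..24 of a 32-bit APIC_ID register as described above; the
helper names are hypothetical and are not the kernel's own macros:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. Assumption: the APIC ID
 * field is the upper 8 bits (31..24) of the 32-bit APIC_ID register. */
#define APIC_ID_SHIFT      24
#define APIC_ID_FIELD_MASK 0xFFu

/* Build the register value that carries `id` in the upper 8 bits. */
static inline uint32_t apic_id_to_reg(uint32_t id)
{
    return (id & APIC_ID_FIELD_MASK) << APIC_ID_SHIFT;
}

/* Extract the ID field from a raw register value. */
static inline uint32_t apic_id_from_reg(uint32_t reg)
{
    return (reg >> APIC_ID_SHIFT) & APIC_ID_FIELD_MASK;
}
```

Writing the raw ID without the shift leaves the upper byte zero, so a
read-back through apic_id_from_reg() yields 0 whatever ID was intended.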

In particular, Mikael's patch will BUG() if the MP tables don't match the
APIC ID. I think that's extremely rude: we should select one of the two
and just run with it, instead of unconditionally failing.
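
The "pick one and run with it" policy could look roughly like the sketch
below. Everything here is hypothetical (the enum, the function name, the
warning text); it only illustrates warning and continuing instead of
failing, and does not say which of the two sources should win:

```c
#include <stdio.h>

/* Hypothetical policy switch: which source wins on a mismatch. */
enum apic_id_policy { PREFER_MP_TABLE, PREFER_HARDWARE };

/* Return the boot CPU APIC ID to use. On a mismatch, warn and select
 * one of the two values according to `policy` rather than aborting. */
static unsigned int resolve_boot_apic_id(unsigned int mp_table_id,
                                         unsigned int hw_id,
                                         enum apic_id_policy policy)
{
    if (mp_table_id == hw_id)
        return hw_id;

    fprintf(stderr,
            "warning: MP table says boot APIC ID %u, hardware says %u\n",
            mp_table_id, hw_id);

    return policy == PREFER_MP_TABLE ? mp_table_id : hw_id;
}
```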

> 2) phys_cpu_present_map = 1 means we always set bit 0, but later on
> in setup_local_APIC() we do
>     if (!clustered_apic_mode &&
>         !test_bit(GET_APIC_ID(apic_read(APIC_ID)), &phys_cpu_present_map))
>             BUG();
> and the bug is triggered if the APIC_ID is not zero.
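
To illustrate the failure described in point 2: a present map hard-coded
to 1 only ever marks APIC ID 0, so the membership test fails for any
other boot ID. A minimal sketch, assuming the map is a single unsigned
long bitmask indexed by APIC ID; the helper names are hypothetical:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the test_bit() check quoted above. */
static bool apic_id_present(unsigned long present_map, unsigned int apic_id)
{
    if (apic_id >= 8 * sizeof(present_map))
        return false;               /* ID does not fit in the map */
    return (present_map >> apic_id) & 1UL;
}

/* Build a map that marks the given APIC ID instead of always bit 0. */
static unsigned long present_map_for(unsigned int apic_id)
{
    if (apic_id >= 8 * sizeof(unsigned long))
        return 0;                   /* out of range: mark nothing */
    return 1UL << apic_id;
}
```

With a map of 1, apic_id_present(1UL, 1) is false, which is the condition
that trips the BUG(); a map built from the actual ID passes the check.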

Yeah, there's no question something is wrong. However:

> Here's Mikael's patch again -- it's quite obviously correct, it fixes the
> problem on my SMP AMD boxes and doesn't break anything else I've thrown at
> it. Applies cleanly to both 2.4 and 2.5.latest.

I disagree with the "obviously correct", due to the above issue of
mismatches between MP tables and actual APIC contents. I think it is more
correct than what we have now, but..

                Linus
