Re: 2.6.0-test3-mm1: scheduling while atomic (ext3?)

From: Alan Cox
Date: Wed Aug 13 2003 - 09:16:03 EST


On Mer, 2003-08-13 at 14:14, Andi Kleen wrote:
> hw prefetch has nothing to do with how the linux kernel uses prefetch.
> It's only using it for data structures that cannot be handled by
> the auto prefetcher.
>
> [except the broken 3dnow! copy that was never enabled]

Not broken in 2.4, although the 2.4-ac kernel uses movntq instead for
Athlon as it is faster than mmx_memcpy, which we use for Cyrix/VIA/IDT
processors where it is a win.

> > > Most Athlons in existence should have XMM already and the rest works.
> >
> > Lots don't have XMM
>
> All XPs have.

And what about all the pre MP/XP ones, lots of those.

> My AMD manual lists it as part of 3dnow. If an CPU advertises 3dnow!
> but doesn't have the instruction it's broken.

My AMD docs list it as part of the AMD extended 3dnow. The original
3dnow as done by AMD/Cyrix does not have it

> I would consider the MII broken then. setup should clear the 3dnow
> bit.

"Mummy it doesnt work like I personally have decreed it shall lets break
it and screw all the users". Thats the Dan Bernstein school of charm
theory of software development.

> there is 3dnowext, set on Athlons, but K6 has prefetchw too and
> it

3dnowext is what we want here. It might end up doing a prefetchw on
K6 but at least K6 actually has the instruction...

> But if you only want Athlon you can check for X86_FEATURE_K7.
> The problem is that it doesn't include K8 and K8
> has prefetchw too (alternative currently only allows a single
> bit, not a bitmask). Better is to either clear 3dnow on the MII
> or define a new pseudo bit that defines working and useful
> prefetchw

We want a pseudobit - otherwise we'll break other code that checks
3dnow is present properly.

> > #if defined(CONFIG_MK7)
> > alternative_input("prefetch" or "prefetchnta")
> > #else
> > alternative_input(ASM_NOP4 or "prefetchnta");
> > #endif
>
> No for weird combinations you define a new pseudo CPUID capability
> bit, check for that in the CPU detection and use that in the alternative.

Ok

> If you really want 3 way alternative you can just define a macro
> for it. The basic data structure supports it - the macro
> just needs to have two .altinstructions records and two replacement codes.
> But I have my doubt it is worth it for this case.

prefetching is a big win on older Athlon because the CPU is fast and the
chipset/ram sucks hugely relative to it

> No stinkin' ifdefs please, that would break the whole concept.

Ok

> > case at least gets us back to desirable situations. Also if I remember
> > the prefetch exception thing rightly you can misalign the prefetch
> > instruction as a workaround.
>
> Nope, no misalignment. All it does is to just handle the exception
> using __ex_table and jumps to the next instruction.

If you misalign the instruction you don't seem to get the exception on
Athlon, dunno about the Opteron errata or if the opteron errata bites in
32bit. If it does I guess we should clear mmx, xmm for Opteron by your
arguments ;)


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/