Ok, thanks for the confirmation. At least we know it's a CPU problem, not
a Linux problem.
However, I'd really hate to have a config option for "broken CPUs". It
gets to be a maintainer's nightmare, and I'd much rather see a generic
routine that is fast yet doesn't break on your cpu (if it was _just_ your
cpu I could just ignore it, but there is always the possibility that this
is a "normal" problem for some Cyrix chips).
Could you test a slightly modified 1.3.x version of the memcpy routine?
If it's a hardware register interlock problem or something like that, it
might go away with a simple re-ordering of instructions (or even just
changing one instruction into another one).
First, could you change the segment register move through %cx into a
push/pop pair instead? That would result in:
static inline void __generic_memcpy_tofs(void * to, const void * from, unsigned long n)
{
    __asm__ volatile
        ("      cld
                push %%es
                push %%fs
                pop %%es
                cmpl $3,%0
                jbe 1f
                movl %%edi,%%ecx
                negl %%ecx
                andl $3,%%ecx
                subl %%ecx,%0
                rep; movsb
                movl %0,%%ecx
                shrl $2,%%ecx
                rep; movsl
                andl $3,%0
        1:      movl %0,%%ecx
                rep; movsb
                pop %%es"
        :"=abd" (n)
        :"0" (n),"D" ((long) to),"S" ((long) from)
        :"cx","di","si");
}
(This is on the assumption that the problem is due to the segment
register handling: that's really the only thing that makes this particular
function special in the kernel - all other accesses to user mode use the
%fs register directly.)
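
For reference, the copy strategy that the assembly implements can be
sketched in portable C. This is only an illustration under stated
assumptions: sketch_copy is a made-up name, not a kernel function, and the
sketch ignores the %es/%fs segment handling entirely (which is exactly the
part suspected of triggering the problem). It shows only the
head/body/tail split: byte copies until the destination is 4-byte aligned,
then 4-byte words, then the remaining 0-3 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only; not part of the kernel. */
static void *sketch_copy(void *to, const void *from, size_t n)
{
    unsigned char *d = to;
    const unsigned char *s = from;

    if (n > 3) {                                 /* cmpl $3,%0 ; jbe 1f   */
        /* Bytes needed to bring the destination up to a 4-byte boundary. */
        size_t head = (size_t)(-(uintptr_t)d & 3);   /* negl ; andl $3    */

        assert(head <= n);                       /* head <= 3 < n         */
        n -= head;                               /* subl %ecx,%0          */
        while (head--)                           /* rep ; movsb           */
            *d++ = *s++;

        /* Copy the aligned middle part four bytes at a time. */
        for (size_t words = n >> 2; words; words--) {  /* shrl $2 ; movsl */
            memcpy(d, s, 4);
            d += 4;
            s += 4;
        }
        n &= 3;                                  /* andl $3,%0            */
    }

    while (n--)                                  /* 1: rep ; movsb        */
        *d++ = *s++;

    return to;
}
```

Comparing the output of a user-space routine like this against a known-good
copy for every small length and destination alignment is one way to
exercise the three code paths, although it cannot reproduce anything that
depends on the segment register loads.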
The second thing you could try is to move the "cmpl $3,%0" one
instruction earlier (the flags will be unaffected by the "pop %es"
instruction). That would catch the case where the interlock problem is
due to back-to-back segment register accesses.
The third thing you might try is to insert a "nop" before the "rep ;
movsl", on the assumption that the interlock problem is between the
shift and movsl instructions (but that's unlikely: that particular
combination shows up even in normal code). With both the second and
third changes applied, that would be:
static inline void __generic_memcpy_tofs(void * to, const void * from, unsigned long n)
{
    __asm__ volatile
        ("      cld
                push %%es
                push %%fs
                cmpl $3,%0
                pop %%es
                jbe 1f
                movl %%edi,%%ecx
                negl %%ecx
                andl $3,%%ecx
                subl %%ecx,%0
                rep; movsb
                movl %0,%%ecx
                shrl $2,%%ecx
                nop
                rep; movsl
                andl $3,%0
        1:      movl %0,%%ecx
                rep; movsb
                pop %%es"
        :"=abd" (n)
        :"0" (n),"D" ((long) to),"S" ((long) from)
        :"cx","di","si");
}
> Just to be sure that it is indeed the Cyrix chip, I'm going to have it
> replaced tomorrow.
I'd ask you to try to keep that machine alive, if only to see whether there
is any alternative way of fixing it (like above). Testing it is horrible, I
know (changing asm-i386/segment.h will result in almost everything
getting recompiled, and then it probably takes at least half a day to see
if the problem is still there..), but you're the only one that sees the
problem, so..
Thanks for testing this all - I was ready to give up on that particular
machine already..
Linus