Re: [PATCH 09/10] x86-32: use SSE for atomic64_read/set if available
From: Luca Barbieri
Date: Thu Feb 18 2010 - 04:53:15 EST
> You seem to have forgotten to add benchmark results that show this is
> actually worthwhile? And is there really any user on 32-bit
> that needs a 64-bit atomic_t?
perf is currently the main user.
On Core2, lock cmpxchg8b takes about 24 cycles and always writes the
cacheline, even for a plain read, while movlps takes 1 cycle.
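To illustrate the two access paths compared above, here is a minimal user-space sketch. It is not the patch code: the function names are hypothetical, it uses GCC/Clang builtins and SSE intrinsics instead of the kernel's inline assembly, and it assumes an aligned 8-byte movlps is performed as a single access. Off x86 it falls back to the compiler's atomic builtins.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* _mm_loadl_pi / _mm_storel_pi map to movlps */
#endif

/* Read via compare-and-swap: compare *p with 0 and store 0 if equal.
 * The value never changes, but the old value is returned either way.
 * On x86-32 (built for i586 or later) this becomes lock cmpxchg8b,
 * which takes the cacheline for writing even though this is a read. */
static uint64_t read64_cmpxchg(uint64_t *p)
{
    return __sync_val_compare_and_swap(p, (uint64_t)0, (uint64_t)0);
}

/* Read via a single 8-byte SSE load (movlps). Assumes p is 8-byte aligned. */
static uint64_t read64_sse(const uint64_t *p)
{
    assert(((uintptr_t)p & 7) == 0);
#if defined(__SSE__)
    uint64_t out;
    __m128 v = _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)p);
    _mm_storel_pi((__m64 *)&out, v);
    return out;
#else
    /* Assumed fallback for non-SSE builds; not part of the original idea. */
    return __atomic_load_n(p, __ATOMIC_RELAXED);
#endif
}

/* Set via a single 8-byte SSE store (movlps). Assumes p is 8-byte aligned. */
static void set64_sse(uint64_t *p, uint64_t val)
{
    assert(((uintptr_t)p & 7) == 0);
#if defined(__SSE__)
    __m128 v = _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)&val);
    _mm_storel_pi((__m64 *)p, v);
#else
    __atomic_store_n(p, val, __ATOMIC_RELAXED);
#endif
}
```

User space can touch the XMM registers freely. Kernel code cannot, so this sketch says nothing about the cost of making them usable there.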
clts/stts probably wipe out the savings if we need to use them, but we
can keep TS off and restore it lazily on return to userspace.
According to http://turkish_rational.tripod.com/trdos/pentium.txt
> I'm also suspicious of your use of global register variables.
> This means they won't be saved on entry/exit of the functions.
> Does that really work?
I think it does.
The functions never change the global register variables, and thus
they are preserved.
Calls are done in inline assembly, which saves the variables if they
are actually used as parameters (the global register variables are
only visible in a portion of the C file, of course).