On Thu, Nov 21, 2002 at 09:21:47AM -0800, Andrew Morton wrote:
> > Good eyes. But.. this also applies to 2.4 (which should also then
> > get faster). So the gap between 2.4 & 2.5 must be somewhere else ?
>
> But 2.4 already inlines the usercopy functions. With this benchmark,
> the cost of the function call is visible. Same with the dir_rtn_1
> test - it is performing zillions of 3, 7, 10-byte copies into userspace.
But the reduction of number of copy_*_user's applies to 2.4 too right ?
> We'd buy a bit by arranging for the in-kernel copy of the fp state
> to have the same layout as the hardware. That way it can be done in
> a single big, fast, well-aligned slurp. But for some reason that code has
> to convert into and out of a different representation.
Possibly hardware restrictions, I'm unfamiliar with how this voodoo works.
> But the real low-hanging fruit here is the observation that the
> test application doesn't use floating point!!!
Interesting point. What was the test app ? contest ?
(I missed the beginning of this thread).
> Maybe we need to take an fp trap now and then to "poll" the application
> to see if it is still using float.
Neat idea. Bonus points for making it work 8-)
Dave
-- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Sat Nov 23 2002 - 22:00:37 EST