Re: Intel P6 vs P7 system call performance

From: Jamie Lokier (lk@tantalophile.demon.co.uk)
Date: Sat Dec 21 2002 - 21:18:24 EST


Linus Torvalds wrote:
> - eflags (kernel has no sane way to restore things like TF in it
> atomically with a sysexit)

It is better to use iret with TF. The penalty of forcing _every_
system call to pushfl and popfl in user space is quite a lot: I
measured 30 cycles for "pushfl;popfl" on my 366MHz Celeron.

("sysenter, setup segments, call a function in kernel space, restore
segments, sysexit" takes 82 cycles on the same Celeron, so 30 cycles
is quite a significant proportion to add to that. Btw, _82_, not 200
or so).

> - ebp (kernel has to reload it with arg-6)
> - ecx/edx (kernel _cannot_ restore them).

These are only needed when delivering a signal.

> Your games with looking at %eip are fragile as hell.

Like we don't play %eip games anywhere else... (the page fault fixup
table comes to mind).

> because you have a political agenda that you want to support, that
> is not really supportable.

And there was me thinking I was performance-tuning some code.
Politics, it gets everywhere, like curry gets onto anything white.

> Saving and restoring the two registers
> means that they get easier and more efficient to use from inline asms for
> example, and means that the code is simpler.

They are not more efficient from inline asms, though marginally
simpler to write (shorter clobber list). You just moved the cost from
the asm itself, where it is usually optimised away, to the trampoline
where it is always present (and cast in stone).

> Your suggestion has _zero_ advantages. Doing two register pop's takes a
> cycle, and means that the calling sequence is simple and has no special
> cases.

(Plus another cycle for the two pushes...)

> Th eexample code you posted is fragile as hell. Looking at "eip" means
> that the different system call entry points now have to be extra careful
> not to have the same return points, which is just _bad_ programming.

We are talking about a _very_ small trampoline, which is simplicity
itself compared with entry.S in general. You're right about the extra
care (so write a comment!), although it does just work for _all_ entry
points. Is this really worse than your own "wonderful hack"?

<shrug> You're the executive decision maker. I just know how to
write fast code. Thanks for listening.

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Dec 23 2002 - 22:00:30 EST