Re: SYSENTER based syscalls patch, 2.1.105, RFC

Jean Wolter (jw5@os.inf.tu-dresden.de)
18 Jun 1998 11:50:16 +0200


--Multipart_Thu_Jun_18_11:50:16_1998-1
Content-Type: text/plain; charset=US-ASCII

<mingo@hal.cobaltmicro.com> writes:

> the attached patch implements SYSENTER/SYSEXIT based system calls for P6
> type x86 CPUs that support it. (tested only on PII, - K6, PPro anyone?)

I have played around with the corresponding syscall/sysret features of
the K6; here are some notes, experiences and code fragments.
Unfortunately I have an older version of the K6, which implements a
rather limited version of syscall/sysret.

On the K6 there is a special register called STAR_MSR, which contains
two segment selectors and an eip. One selector and the eip are used by
the syscall instruction; the other selector is used by sysret. After
entering the kernel using syscall, ecx contains the return address. esp
is not changed.

Round trip times are 31 cycles (just entering the kernel and
immediately returning to user level).
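
A per-round-trip figure like this is typically derived by reading the
time stamp counter before and after a loop of kernel entries and
dividing the difference by the loop count. A minimal sketch of that
arithmetic in portable C follows; the function names and the sample
numbers are made up for illustration, and reading the counter itself
(rdtsc) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the edx:eax pair returned by rdtsc into one 64-bit count. */
uint64_t tsc_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Average cycles per iteration between two counter readings.
   Hypothetical helper, not part of the attached code. */
uint64_t cycles_per_iteration(uint64_t start, uint64_t end,
                              uint32_t iterations)
{
    assert(iterations > 0);
    return (end - start) / iterations;
}
```

For example, 310000 elapsed cycles over 10000 iterations gives 31
cycles per round trip. Doing the subtraction on the combined 64-bit
values avoids a wrong result when the low half of the counter wraps
between the two readings.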

So we have a nice and fast way to enter the kernel, but a few problems:

- Binary compatibility - syscall overwrites ecx, so we have to use at
least one other register (ebp) to transfer that value into the kernel

- esp is not changed; it still points to the user level stack. We have
to reload it after entering the kernel, so we have to keep track of the
current kernel esp, which should be simple (at least for a
uniprocessor Linux)

- Another problem is the difference between K6 revisions:

There is a big difference between the documented behavior
(21086.pdf) and the behavior of early implementations of the K6.

Documented behavior:
  STAR-MSR: 63-48 SYSRET cs and ss selector base
      during SYSRET this is copied to cs, ss = cs+8
  STAR-MSR: 47-32 SYSCALL cs and ss selector base
      during SYSCALL this is copied to cs, ss = cs+8
  STAR-MSR: 31-0 Target eip address
      during SYSCALL this is copied to eip

Older versions of K6:
  STAR-MSR: 63-48 not writable, write operations lead to a
      general protection fault
  STAR-MSR: 47-32 SYSCALL cs and ss selector base
      during SYSCALL this is copied to cs, ss = cs+8
      during SYSRET this is copied to cs, privilege is set to 3,
      ss = cs+0x10
  STAR-MSR: 31-0 Target eip address
      during SYSCALL this is copied to eip
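
The two layouts above can be modelled in a few lines of portable C.
This is only a sketch of the selector arithmetic as listed, not of the
hardware; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct selectors { uint16_t cs, ss; };

/* Pack the three documented STAR fields into one 64-bit MSR value. */
uint64_t star_pack(uint16_t sysret_base, uint16_t syscall_base,
                   uint32_t eip)
{
    return ((uint64_t)sysret_base << 48) |
           ((uint64_t)syscall_base << 32) | eip;
}

/* SYSCALL (both variants): cs = bits 47-32, ss = cs + 8. */
struct selectors syscall_selectors(uint64_t star)
{
    struct selectors s;
    s.cs = (uint16_t)(star >> 32);
    s.ss = (uint16_t)(s.cs + 8);
    return s;
}

/* SYSRET as documented: cs = bits 63-48, ss = cs + 8. */
struct selectors sysret_selectors_documented(uint64_t star)
{
    struct selectors s;
    s.cs = (uint16_t)(star >> 48);
    s.ss = (uint16_t)(s.cs + 8);
    return s;
}

/* SYSRET on older K6: cs = bits 47-32 with privilege level 3,
   ss = cs + 0x10. */
struct selectors sysret_selectors_old_k6(uint64_t star)
{
    struct selectors s;
    s.cs = (uint16_t)((star >> 32) | 3);
    s.ss = (uint16_t)(s.cs + 0x10);
    return s;
}

/* Example values: SYSCALL base 0x10, arbitrary target eip. */
void star_self_check(void)
{
    uint64_t star = star_pack(0, 0x10, 0xc0100000u);
    struct selectors r = sysret_selectors_old_k6(star);
    assert(r.cs == 0x13 && r.ss == 0x23);
}
```

With a SYSCALL base of 0x10, the older-K6 model yields cs = 0x13 and
ss = 0x23 on sysret, the same values as in the register dump in the PS.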

So we have a problem with our segment selectors. Maybe we can use
syscall to enter the kernel and iret to leave it (at least on those
older processor versions).

I will see if I can come up with a complete patch for the K6 over the
weekend. In the meantime, here are some code fragments that show how it
works and let you play around with syscall and sysret: a small
loadable module and a small C program using syscall and sysret. Please
try to understand what happens before using them. (The code doesn't
look very nice; it is just there to find out how things work.)
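
In outline, the attached module checks bit 10 of edx from CPUID
function 0x80000001 (the encoding used by older K6 versions; according
to the comment in k6.c, newer ones use bit 11), writes STAR (MSR
0xc0000081) and sets bit 0 of the extended feature register (MSR
0xc0000080). The sketch below models only the bit manipulation in
portable C; the macro and function names are made up for illustration,
and the privileged cpuid/rdmsr/wrmsr instructions are left out.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_EFER 0xc0000080u  /* extended feature register */
#define MSR_STAR 0xc0000081u  /* syscall/sysret target register */

#define CPUID_SYSCALL_OLD_K6 (1u << 10)  /* older K6 versions */
#define CPUID_SYSCALL_NEW    (1u << 11)  /* newer versions */
#define EFER_SYSCALL_ENABLE  1u          /* bit 0 */

/* Test the capability bit in edx of CPUID function 0x80000001. */
int has_syscall(uint32_t cpuid_edx, int old_k6)
{
    uint32_t bit = old_k6 ? CPUID_SYSCALL_OLD_K6 : CPUID_SYSCALL_NEW;
    return (cpuid_edx & bit) != 0;
}

/* New low word of the feature register with syscall/sysret enabled. */
uint32_t efer_enable(uint32_t low)
{
    return low | EFER_SYSCALL_ENABLE;
}

/* New low word of the feature register with syscall/sysret disabled. */
uint32_t efer_disable(uint32_t low)
{
    return low & ~EFER_SYSCALL_ENABLE;
}

/* Example: enabling and disabling leaves the other bits untouched. */
void efer_self_check(void)
{
    assert(has_syscall(0x400u, 1));
    assert(!has_syscall(0x400u, 0));
    assert(efer_disable(efer_enable(0x100u)) == 0x100u);
}
```

Both helpers take the value read back from the register and change
only bit 0, so the read-modify-write pattern preserves whatever else is
set there.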

Jean

PS: To verify the behavior I have forced a segmentation fault in my
small test program. The core dump shows the following segment
registers (kernel cs with user level privileges, ss=cs+0x10):

(gdb) i r
eflags 0x10246 66118
cs 0x13 19
ss 0x23 35
ds 0x2b 43

To verify that esp isn't reloaded and that ecx contains the return
address, the module prints both values after entering the kernel using
syscall.

hello world (esp: bffffc4c, ecx: 8048420)


--Multipart_Thu_Jun_18_11:50:16_1998-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="k6.c"
Content-Transfer-Encoding: base64

I2luY2x1ZGUgPGxpbnV4L2ZzLmg+CiNpbmNsdWRlIDxsaW51eC9tb2R1bGUuaD4KI2luY2x1
ZGUgPGxpbnV4L3ZlcnNpb24uaD4KI2luY2x1ZGUgPGxpbnV4L21tLmg+CiNpbmNsdWRlIDxh
c20vc2VnbWVudC5oPgoKY2hhciBrZXJuZWxfdmVyc2lvbltdID0gVVRTX1JFTEVBU0U7Cgoj
ZGVmaW5lIERFQlVHCi8qICNkZWZpbmUgTUVBU1VSRSAqLwoKI2lmZGVmIERFQlVHCiNkZWZp
bmUgUFJJTlRLKCBhcmdzLi4uICkgcHJpbnRrKCBhcmdzKQojZWxzZQojZGVmaW5lIFBSSU5U
SyggYXJncy4uLiApCiNlbmRpZgoKdm9pZCBmYXN0X3N5cyh2b2lkKTsgCmFzbSgKICAgICIu
Z2xvYmwgZmFzdF9zeXM7IgogICAgImZhc3Rfc3lzOjsiCiNpZmRlZiBNRUFTVVJFCiAgICAi
LmJ5dGUgMHgwZiwgMHg3OzsiCiNlbmRpZgogICAgInB1c2hsICVlY3g7IHB1c2hsICVlc3A7
IiAKICAgICJjYWxsIHRlc3RfbXNnOyAiCiAgICAicG9wbCAlZWN4OyBwb3BsICVlY3g7Igog
ICAgIi5ieXRlIDB4MGYsIDB4Nzs7IgogICAgKTsKCmFzbWxpbmthZ2Ugdm9pZCB0ZXN0X21z
ZyhpbnQgZXNwLCBpbnQgZWN4KQp7CiAgcHJpbnRrKCJoZWxsbyB3b3JsZCAoZXNwOiAleCwg
ZWN4OiAleClcbiIsIGVzcCwgZWN4KTsKfQovKioqKioqKioqKioqKioqKioqKioqKioqKioq
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKi8KLyogQXV4aWxpYXJ5
IGZ1bmN0aW9uczogcmRtc3IgYW5kIHdybXNyLiBDYWxsZWQgdG8gcmVhZCBNU1IgdmFsdWVz
ICovCi8qKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq
KioqKioqKioqKioqKioqKioqLwovKiBSZWFkcyBhbiBNU1IgYW5kIHJldHVybnMgdGhlIHZh
bHVlcyB0byB1c2VyICovCgp2b2lkCnJkbXNyKCBjb25zdCB1X2ludCBtc3IsIHVfaW50ICpo
aSwgdV9pbnQgKmxvICkKewogIF9fYXNtIF9fdm9sYXRpbGUoCgkibW92bCAlMiwgJSVlY3gg
ICAgICAgICMgTVNSIHRvIGJlIHJlYWQKICAgICAgICAuYnl0ZSAweGY7IC5ieXRlIDB4MzIg
ICMgUkRNU1IKICAgICAgICBtb3ZsICUlZWR4LCAlMCAgICAgICAgICMgaGlnaCBvcmRlciBi
aXRzCiAgICAgICAJbW92bCAlJWVheCwgJTEgICAgICAgICAjIGxvdyBvcmRlciBiaXRzIgoJ
OiAiPWciICgqaGkpLCAiPWciICgqbG8pOiAiZyIgKG1zcik6ICJlYXgiLCAiZWN4IiwgImVk
eCIgKTsKCn0gLyogZW5kLXJkbXNyICovCgovKioqKioqKioqKioqKioqKioqKioqKioqKioq
KioqKioqKioqKioqKioqKioqKiovCi8qIFdyaXRlcyB0aGUgdmFsdWVzIHRvIGdpdmVuIE1T
UiAgICAgICAgICAgICAgKi8KLyoqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq
KioqKioqKioqKioqLwoKdm9pZCB3cm1zciggY29uc3QgdV9pbnQgbXNyLCB1X2ludCBoaSwg
dV9pbnQgbG8gKQp7CiAgX19hc20gX192b2xhdGlsZSgKCSJtb3ZsICUwLCAlJWVjeCAgICAg
ICAgIyBNU1IgdG8gYmUgd3JpdHRlbgogICAgICAgIG1vdmwgJTEsICUlZWR4ICAgICAgICAg
IyBoaWdoIG9yZGVyIGJpdHMKICAgICAgIAltb3ZsICUyLCAlJWVheCAgICAgICAgICMgbG93
IG9yZGVyIGJpdHMKICAgICAgICAuYnl0ZSAweGY7IC5ieXRlIDB4MzAgICMgV0RNU1IgIgoJ
OiA6ImciIChtc3IpLCAiZyIgKGhpKSwgImciIChsbyk6ICJlYXgiLCAiZWN4IiwgImVkeCIg
KTsKfSAvKiBlbmQtd3Jtc3IgKi8KCmV4dGVybiBpbmxpbmUgdm9pZCBjcHVpZChpbnQgb3As
IGludCAqZWF4LCBpbnQgKmVieCwgaW50ICplY3gsIGludCAqZWR4KQp7CglfX2FzbV9fKCJj
cHVpZCIKCQk6ICI9YSIgKCplYXgpLAoJCSAgIj1iIiAoKmVieCksCgkJICAiPWMiICgqZWN4
KSwKCQkgICI9ZCIgKCplZHgpCgkJOiAiYSIgKG9wKQoJCTogImNjIik7Cn0KCgovKioqKioq
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioKICog
Q2FsbGVkIGJ5IGluc21vZCB0byBsb2FkIHRoZSBwZW50aXVtIGNvdW50ZXIgbW9kdWxlCSoK
ICogaW50byBtZW1vcnkgYW5kIHJlZ2lzdGVyIGl0IHdpdGhpbiB0aGUga2VybmVsLgkqCiAq
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq
Ki8KaW50IGluaXRfbW9kdWxlKCkgCnsKICB1bnNpZ25lZCBpbnQgIGR1bW15LCBjYXAsIGNz
LHNzOwogIHVfaW50IGhpZ2gsIGxvdzsKCiAgUFJJTlRLKCJtb2R1bGUgbG9hZGVkXG4iKTsK
CiAgLyogQ2hlY2sgZm9yIHN5c2NhbGwvc3lzcmV0IGNhcGFiaWxpdHk7IG9sZGVyIHZlcnNp
b25zIG9mIEs2IHVzZSBiaXQgMTAgdG8KICAgICBzaWduYWwgc3VwcG9ydCBmb3IgdGhpcyBm
ZWF0dXJlLiBOZXdlciBvbmUgdXNlIGJpdCAxMSAqLwogIGNwdWlkKDB4ODAwMDAwMDEsICZk
dW1teSwgJmR1bW15LCAmZHVtbXksICZjYXApOyAKICBpZiAoY2FwICYgMHg0MDApIHsKICAg
IFBSSU5USygic3lzY2FsbC9zeXNyZXQgY2FwYWJpbGl0eSBkZXRlY3RlZCAoJWx4KVxuIiwg
Y2FwKTsKICB9IGVsc2UgewogICAgUFJJTlRLKCJubyBzeXNjYWxsL3N5c3JldCBjYXBhYmls
aXR5IGRldGVjdGVkICglbHgpXG4iLCBjYXApOwogICAgcmV0dXJuIC0xOwogIH0KCiAgLyog
c2V0dXAgY29kZSBzZWdtZW50IGFuZCBlbnRyeSBhZGRyZXNzIGZvciBzeXNjYWxsCiAgICAg
VGhlcmUgaXMgYSBiaWcgZGlmZmVyZW5jZSBiZXR3ZWVuIHRoZSBkb2N1bWVudGVkIGJlaGF2
aW9yIAogICAgICgyMTA4Ni5wZGYpIGFuZCB0aGUgYmVoYXZpb3Igb2YgZWFybHkgaW1wbGVt
ZW50aW9ucyBvZiB0aGUgSzYuCgogICAgIERvY3VtZW50ZWQgYmVoYXZpb3I6CiAgICAgU1RB
Ui1NU1I6IDYzLTQ4IFNZU1JFVCBjcyBhbmQgc3Mgc2VsZWN0b3IgYmFzZQogICAgICAgIGR1
cmluZyBTWVNSRVQgdGhpcyBpcyBjb3BpZWQgdG8gY3MsIHNzID0gY3MrOAogICAgIFNUQVIt
TVNSOiA0Ny0zMiBTWVNDQUxMIGNzIGFuZCBzcyBzZWxlY3RvciBiYXNlCiAgICAgICAgZHVy
aW5nIFNZU0NBTEwgdGhpcyBpcyBjb3BpZWQgdG8gY3MsIHNzID0gY3MrOAogICAgIFNUQVIt
TVNSOiAzMS0wIFRhcmdldCBlaXAgYWRkcmVzcwogICAgICAgIGR1cmluZyBTWVNDQUxMIHRo
aXMgaXMgY29waWVkIHRvIGVpcAoKICAgICBPbGRlciB2ZXJzaW9ucyBvZiBLNjoKICAgICBT
VEFSLU1TUjogNjMtNDggbm90IHdyaXRhYmxlLCB3cml0ZSBvcGVyYXRpb25zIGxlYWQgdG8g
YSAKICAgICAgICBnZW5lcmFsIHByb3RlY3Rpb24gZmF1bHQKICAgICBTVEFSLU1TUjogNDct
MzIgU1lTQ0FMTCBjcyBhbmQgc3Mgc2VsZWN0b3IgYmFzZQogICAgICAgIGR1cmluZyBTWVND
QUxMIHRoaXMgaXMgY29waWVkIHRvIGNzLCBzcyA9IGNzKzgKCWR1cmluZyBTWVNSRVQgdGhp
cyBpcyBjb3BpZWQgdG8gY3MsIHByaXZpbGVkZ2UgaXMgc2V0IHRvIDMsCglzcz1jcysweDEw
CiAgICAgU1RBUi1NU1I6IDMxLTAgVGFyZ2V0IGVpcCBhZGRyZXNzCiAgICAgICAgZHVyaW5n
IFNZU0NBTEwgdGhpcyBpcyBjb3BpZWQgdG8gZWlwCiAgICAgKi8KCiAgUFJJTlRLKCJ3cml0
aW5nIHN0YXIgcmVnaXN0ZXJcbiIpOwogIHdybXNyKDB4YzAwMDAwODEsIF9fS0VSTkVMX0NT
LCAodV9pbnQpZmFzdF9zeXMpOwoKICAvKiBzZXQgZmlyc3QgYml0IGluIGV4dGVuZGVkIGZl
YXR1cmUgcmVnaXN0ZXIgMHhjMDAwMDA4MCB0byBlbmFibGUKICAgICBzeXNjYWxsL3N5c3Jl
dCAqLwogIFBSSU5USygiRW5hYmxpbmcgc3VwcG9ydCBmb3Igc3lzY2FsbC9zeXNyZXRcbiIp
OwogIHJkbXNyKDB4YzAwMDAwODAsICZoaWdoLCAmbG93KTsKICB3cm1zcigweGMwMDAwMDgw
LCBoaWdoLCBsb3cgfCAxTCk7CgogIHJldHVybiAwOwp9IC8qIGluaXRfbW9kdWxlICovCgov
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq
KioKICogQ2FsbGVkIGJ5IHJtbW9kIGF0IHRoZSBlbmQgdG8gcmVtb3ZlIHRoZSBwZW50aXVt
CSoKICogZGV2aWNlIGRyaXZlciBmcm9tIGtlcm5lbCAJCQkJKgogKioqKioqKioqKioqKioq
KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKiovCnZvaWQgY2xlYW51
cF9tb2R1bGUodm9pZCkKewogIHVfaW50IGhpZ2gsIGxvdzsKCiAgUFJJTlRLKCJEaXNhYmxp
bmcgc3VwcG9ydCBmb3Igc3lzY2FsbC9zeXNyZXRcbiIpOwogIHJkbXNyKDB4YzAwMDAwODAs
ICZoaWdoLCAmbG93KTsKICB3cm1zcigweGMwMDAwMDgwLCBoaWdoLCBsb3cgJiB+MUwpOwog
IHdybXNyKDB4YzAwMDAwODEsIDAsIDApOwp9IC8qIGVuZC1jbGVhbnVwX21vZHVsZSAqLwo=

--Multipart_Thu_Jun_18_11:50:16_1998-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="fastcall.c"
Content-Transfer-Encoding: base64

I2luY2x1ZGUgPHN0ZGlvLmg+CnZvaWQgbWFpbih2b2lkKQp7CiAgYXNtKAogICAgICAiCm1v
dmwgJDEwMDAwLCAlZWJ4OwpyZHRzYzsKbW92bCAlZWF4LCAlZWRpOwptb3ZsICVlZHgsICVl
c2k7CjE6Ci5ieXRlIDB4MGYsIDB4NTsgCmRlY2wgJWVieDsKam56IDFiOwpyZHRzYzsgCnN1
YmwgJWVkaSwgJWVheDsgCnN1YmwgJWVzaSwgJWVkeDsgCm1vdmwgJDAsIDAiKTsKfQo=

--Multipart_Thu_Jun_18_11:50:16_1998-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="Makefile"
Content-Transfer-Encoding: base64

YWxsOiBrNi5vCgprNi5vOglrNi5jCglnY2MgLWcgLWMgLWZuby1zdHJlbmd0aC1yZWR1Y2Ug
LUkgPGluc2VydCB5b3VyIDIuMSBpbmNsdWRlIHBhdGg+IFwKCS1EX19MSU5VWF9fIC1ETElO
VVggLURfX0tFUk5FTF9fICAtTzIgLW8gJEAgJF4KCmZhc3RjYWxsOiBmYXN0Y2FsbC5jCgln
Y2MgLWcgLW8gJEAgJF4KY2xlYW46CglybSAtZiAqLm8K

--Multipart_Thu_Jun_18_11:50:16_1998-1
Content-Type: text/plain; charset=US-ASCII

-- 
I get up each morning, gather my wits.
Pick up the paper, read the obits.
if I'm not there I know I'm not dead.
So I eat a good breakfast and go back to bed. Pete Seeger

--Multipart_Thu_Jun_18_11:50:16_1998-1--
