Re: Need help in understanding x86 syscall
From: Steven Rostedt
Date: Thu Aug 11 2005 - 08:55:13 EST
On Wed, 2005-08-10 at 22:39 -0700, Ukil a wrote:
> I had this question. As per my understanding, in the
> Linux system call implementation on x86 architecture
> the call flows like this int 0x80 -> syscall ->
> sys_call_vector(taken from the table)-> return from
> interrupt service routine.
>
> Now I had the doubt that if the syscall
> implementation is very large will the scheduling and
> other interrupts be blocked for the whole time till
> the process returns from the ISR (because in an ISR by
> default the interrupts are disabled unless “sti” is
> called explicitly)? That appears to be too long for
> the scheduling or other interrupts to be blocked?
> Am I missing something here?
This is where interrupt is not a good term for syscall. It is really a
trap. An interrupt is an outside source that stops the CPU from doing
what it was doing to go do something else (asynchronous event). A trap
is something that the CPU calls on itself to do something else
(synchronous event). So when a network packet comes in, the NIC sends
an interrupt request (request since the CPU may not immediately handle
it), and when the CPU is ready (interrupts on) it will stop what it is
doing and handle the network packet.
When you do a system call, it is a trap. Quoting the Intel manual:

  The difference between an interrupt gate and a trap gate is as
  follows. If an interrupt or exception handler is called through an
  interrupt gate, the processor clears the interrupt enable (IF) flag
  in the EFLAGS register to prevent subsequent interrupts from
  interfering with the execution of the handler. When a handler is
  called through a trap gate, the state of the IF flag is not changed.

So when you go into the kernel through a trap (system call) interrupts
are still on if they were on to begin with, which is the case when
coming from user mode. But when you come into the kernel through a real
interrupt, then interrupts are off.
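
To make the gate difference concrete, here is a minimal userspace sketch of what happens to IF on entry. It is only a model: the names gate_type, eflags_on_entry and interrupts_enabled are made up for illustration and are not kernel or Intel identifiers, and nothing but the IF bit is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 9 of EFLAGS is the interrupt enable flag (IF). */
#define EFLAGS_IF (UINT32_C(1) << 9)

/* Hypothetical names for this sketch; not kernel identifiers. */
enum gate_type { GATE_INTERRUPT, GATE_TRAP };

/*
 * Return the EFLAGS value the handler starts running with.
 * Only IF is modeled here. The CPU saves the old EFLAGS on the
 * stack before this point, so the caller's IF comes back on iret.
 */
static uint32_t eflags_on_entry(uint32_t eflags, enum gate_type gate)
{
    if (gate == GATE_INTERRUPT)
        return eflags & ~EFLAGS_IF;   /* interrupt gate: IF cleared */
    return eflags;                    /* trap gate: IF left as it was */
}

/* True if maskable interrupts can be delivered with this EFLAGS value. */
static bool interrupts_enabled(uint32_t eflags)
{
    return (eflags & EFLAGS_IF) != 0;
}
```

Feeding it a user-mode EFLAGS value with IF set, the trap gate case comes back with interrupts still enabled, and the interrupt gate case comes back with them disabled.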
Also note that if the kernel is built without preemption, a task in a
system call will _not_ be scheduled out unless it explicitly calls
schedule (for example, when it blocks). If you turn on voluntary
preemption, the kernel also calls schedule at various spots that might
call schedule anyway. If you turn on full preemption, then the process
can be scheduled out anywhere in the kernel unless it explicitly says
it doesn't want to (preempt_disable or spin_lock).
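
The three cases can be put into a small decision function. This is a rough userspace model, not kernel code: preempt_model, task_state and may_switch_here are hypothetical names, and it assumes another task is already waiting to run and that interrupts are enabled.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; they mirror the three cases above. */
enum preempt_model { PREEMPT_NONE, PREEMPT_VOLUNTARY, PREEMPT_FULL };

struct task_state {
    int  preempt_count;     /* > 0 after preempt_disable() or spin_lock() */
    bool calls_schedule;    /* the code explicitly calls schedule() here */
    bool at_resched_point;  /* a spot that might call schedule anyway */
};

/*
 * Can the task be switched out at this point inside a system call?
 * Simplification: an explicit schedule() always counts, although
 * calling it with preempt_count > 0 would be a bug in a real kernel.
 */
static bool may_switch_here(enum preempt_model model,
                            const struct task_state *t)
{
    if (t->calls_schedule)
        return true;

    switch (model) {
    case PREEMPT_NONE:
        return false;                  /* only explicit schedule() */
    case PREEMPT_VOLUNTARY:
        return t->at_resched_point;    /* plus the added check points */
    case PREEMPT_FULL:
        return t->preempt_count == 0;  /* anywhere, unless disabled */
    }
    return false;
}
```

With full preemption, a task holding a spin_lock (preempt_count of 1) gets false, and the same task with a count of 0 gets true. Without preemption, both get false until the code calls schedule itself.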
-- Steve