Strange Linux behaviour with blocking syscalls and stop signals+SIGCONT
From: Michael Kerrisk
Date: Mon Jul 03 2006 - 10:44:54 EST
Gidday,
[Various parties involved in past discussions on this topic CCed.]
First off, it's worth mentioning that the topic that I'll go into below
has been visited a few times before, and in particular during one of the
more recent visits, Linus's position was:
http://marc.theaimsgroup.com/?l=linux-kernel&m=104502401330898&w=2
List: linux-kernel
Subject: Re: another subtle signals issue
From: Linus Torvalds
Date: 2003-02-12 4:21:02
...
And I have multiple times said that as far as Linux is
concerned, ^Z is, has always been, and certainly for the 2.6.x
timeframe _will_ be a signal that the kernel considers "caught".
The problem is that the "2.6.x timeframe" means something different now
than it did then, and the question is when the following Linux-specific
(mis)behaviour will ever be fixed.
==
Linux exhibits a unique behaviour among Unix implementations with
respect to signals. When a program that is blocked in the middle of
certain system calls is suspended by a signal (SIGSTSTP (^Z), SIGSTOP,
SIGGTIIN, SIGTTOU) and then resumed by a SIGCONT signal, the system call
fails with the error EINTR. ***This behaviour occurs even when no
signal handler is installed for the stop or SIGCONT signals.*** (An
example program showing the behaviour for one system call is appended to
this message.)
This can cause applications that work on other Unix systems to behave
unexpectedly on Linux.
On Linux, the current (2.6.17) system calls and functions that exhibit
this behaviour are: futex(FUTEX_WAIT), epoll_wait(), poll(), read() from
an inotify file descriptor (but not read()s from any other type of file
descriptor, AFAIK), sem_wait(3) (because of futex(2)), semop(),
semtimedop(), sigtimedwait(), and sigwaitinfo().
I have never seen this behaviour on any other Unix system, and at
various times I've explicitly tested various blocking system calls on at
least the following: Solaris 8, FreeBSD 4.8, HP-UX 11, Tru64 5.1B, Irix
6.5, and Darwin 7.2.
My reading of POSIX is that POSIX only permits a system call to fail
with EINTR if a signal handler is involved (i.e., a signal is only
considered "caught" if a handler is involved, a different definition of
Linus's "caught" quoted at the start of this message). Some inquiries
on the Austin group list quite a while back:
https://www.opengroup.org/sophocles/show_archive.tpl?source=L&listname=austin-group-l&first=1&pagesize=80&searchstring=signals+and+interruption+of+system+calls&zone=G
Or: http://tinyurl.com/l5at8
Date: Fri, 13 Feb 2004 11:44:50 +0100 (MET)
From: "Michael T Kerrisk"
To: The Austin Group
Subject: Stop signals and interruption of system calls on Linux
got answers that agreed with my interpretation, including:
https://www.opengroup.org/sophocles/show_mail.tpl?CALLER=show_archive.tpl&source=L&listname=austin-group-l&id=6668
Or: http://tinyurl.com/rdd7d
Quoting Paul Eggert:
This topic came up in a POSIX.1 standards meeting on April 22, 1993,
and the consensus of that meeting also agreed with you.
> (And of course I'm still curious if any other implementation behaves
> like Linux.)
Long-ago AIX hosts had that bug as well, but they were fixed after
that POSIX.1 discussion of a decade ago. For more details about this,
please see:
David A. Willcox (Motorola MCG - Urbana)
Job Control and POSIX
<http://groups.google.com/groups?selm=1r77ojINN85n%40ftp.UU.NET>
1993-04-22
==
Given the current development model, a fix now, in kernel 2.6.x seems in
order. Probably all of the system calls listed above should be fixed so
that in this circumstance the system call is automatically restarted,
just as currently occurs for many other similar blocking system calls
(e.g., select(), pselect(), mq_send(), mq_receive(), accep(), connect(),
recv(), send(), wait(), flock(), fcntl(F_SETLKW), etc.).
In some past threads on this topic I have seen arguments about ABI
compatability advanced, but I don't believe these hold, for the
following reasons:
a) We are talking about signals involved in *interactive* job control;
therefore any ABI issues are likely to be minimal (and see "c)" below).
b) There is no consistency about which system calls show this behaviour
and which do not. For example poll() shows it, but select() does not.
Notably, ppoll() also does not have the behaviour, since it does signal
processing differently from poll()! Another notable example is read():
on an inotify file descriptor it demonstrates this behaviour, but on
other file descriptors it does not.
c) The Linux baehviour has been arbitrary across kernel versions and
system calls. In particular, the following system calls showed this
behaviour in earlier kernel versions, but then the behaviour was changed
without forewarning and (AFAIK) without subsequent complaint:
* nanosleep() in kernel 2.4 and earlier
* msgsnd() and msgrcv() in kernels before 2.6.9.
==
As far as I can see, there seems no compelling argument not to make
Linux consistent with other current and historical Unix implementations
with respect to the treatment of blocking syscalls and stop
signals+SIGCONT. Is there any reason not to fix things now?
Cheers,
Michael
PS Just for completeness, I note that this topic has been visited before:
http://marc.theaimsgroup.com/?l=linux-kernel&m=94464821126712&w=2
List: linux-kernel
Subject: SIGCONT misbehaviour in Linux
From: Eric PAIRE <eric.paire () ri ! silicomp ! fr>
Date: 1999-12-08 10:14:13
and
http://marc.theaimsgroup.com/?l=linux-kernel&m=104501574824496&w=2
List: linux-kernel
Subject: another subtle signals issue
From: Roland McGrath <roland () redhat ! com>
Date: 2003-02-12 2:06:54
==
Below is an example program demonstrating the behaviour for semop().
This is a sample run:
$ ./a.out x
<type ^Z>
[1]+ Stopped ./r x
$ fg
./a.out x
semop: Interrupted system call
The last line of output shows that the system call failed with EINTR.
$ cat restart_semop.c
/* restart_semop.c */
#define _GNU_SOURCE
#include <string.h>
#include <signal.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#define errMsg(msg) do { perror(msg); } while (0)
#define errExit(msg) do { perror(msg); exit(EXIT_FAILURE); \
} while (0)
static void
catcher(int sig)
{
char msg[100];
sprintf(msg, "Caught signal %d %s\n", sig, strsignal(sig));
fprintf(stderr, "%s", msg);
} /* catcher */
int
main(int argc, char *argv[])
{
struct sigaction sa;
int semId;
struct sembuf sops[10];
sa.sa_handler = catcher;
sigemptyset(&sa.sa_mask);
/* Make system calls restartable if argc > 1 */
sa.sa_flags = (argc > 1) ? SA_RESTART : 0;
if (sigaction(SIGINT, &sa, NULL) == -1) errMsg("sigaction");
semId = semget(IPC_PRIVATE, 1, IPC_CREAT | S_IRUSR | S_IWUSR);
if (semId == -1) errExit("semget");
if (semctl(semId, 0, SETVAL, 1) == -1) errExit("semctl");
sops[0].sem_num = 0;
sops[0].sem_op = 0;
sops[0].sem_flg = 0;
for (;;)
if (semop(semId, sops, 1) == -1) errMsg("semop");
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/