Re: A fifo and signal bug

H.J. Lu (hjl@lucon.org)
Sun, 22 Nov 1998 08:37:59 -0800 (PST)


>
> In list.linux-kernel you write:
> > hjl, please fix "sleep()" in glibc first. If it still fails for you, then
> > I can look at it, right now I can see it failing in the strace on the
> > sleep().
>
> I replaced sleep with a select(0,NULL,NULL,NULL,{WAITTIME,0}), to remove
> the possibility of it being the glibc nanosleep.
>
> I have been sticking in printk's all over the place, and I see why the
> test script is failing (at least for me -- 2.1.129 UP 486).
>
> The child enters fifo_open, and blocks. After a while, the STOP signal
> comes in, and it falls out of the system call with ERESTARTSYS. Quite
> correctly, the PIPE_READERS and PIPE_RD_OPENERS are reset back to zero.
>
> In do_signal, the signal is to be delivered to the child. The relevant
> fragment of code (line 680) is:
>
>     current->state = TASK_STOPPED;
>     current->exit_code = signr;
>     if (!(current->p_pptr->sig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDSTOP))
>         notify_parent(current, SIGCHLD);
>     schedule();
>     continue;
>
> The code to restart the system call on ERESTARTSYS will be entered on
> leaving the loop via the continue. _However_, the schedule does not
> return until much later (when the parent sends the kill signal after
> failing to open the fifo).
>
> This appears to be a result of the first thing that schedule does -- it
> removes the task from the run queue, because it is at state TASK_STOPPED.
>
> A small change (made by somebody who may have missed the big picture
> entirely) is to call schedule only if we were not called from within
> a system call. Potentially this test would need extending so that it
> also checks that eax is one of ERESTART{NOHAND,SYS,NOINTR}.
>
> With this change, the open syscall blocked in fifo_open gets restarted
> for the stopped child process, and the test passes.
>
> --- arch/i386/kernel/signal.c-dist Sun Nov 22 12:04:36 1998
> +++ arch/i386/kernel/signal.c Sun Nov 22 12:05:13 1998
> @@ -682,7 +682,8 @@
>                 current->exit_code = signr;
>                 if (!(current->p_pptr->sig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDSTOP))
>                         notify_parent(current, SIGCHLD);
> -               schedule();
> +               if (regs->orig_eax < 0)
> +                       schedule();
>                 continue;
>
>         case SIGQUIT: case SIGILL: case SIGTRAP:
>
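
For reference, the select()-as-sleep substitution mentioned in the quoted message can be sketched roughly as below. The helper name sleep_with_select is made up for illustration; only the select() call itself comes from the message.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Sleep for `seconds` by calling select() with no file descriptors,
 * so only the timeout can complete the call.
 * Returns 0 when the timeout expires, or -1 with errno set (for
 * example EINTR) if a signal interrupts the wait.
 * The function name is hypothetical, not taken from the test script.
 */
static int sleep_with_select(long seconds)
{
    struct timeval tv;

    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return select(0, NULL, NULL, NULL, &tv);
}
```

This avoids sleep() and therefore the glibc nanosleep path, which is the point of the substitution.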

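The extended test suggested in the quoted message (skip schedule() only when we are in a system call and eax holds one of the restart codes) might look roughly like the user-space sketch below. The struct fake_regs type and both function names are invented for illustration; the numeric values are those of the kernel's ERESTART* definitions. This is not a tested kernel change.

```c
#include <stdbool.h>

/* Restart codes with the values used in the kernel's errno header. */
#define ERESTARTSYS    512
#define ERESTARTNOINTR 513
#define ERESTARTNOHAND 514

/* Hypothetical stand-in for the two i386 pt_regs fields the test reads. */
struct fake_regs {
    long orig_eax;  /* >= 0 when entered via a system call */
    long eax;       /* system call return value */
};

/* True if we are in a system call that is about to be restarted. */
static bool in_restartable_syscall(const struct fake_regs *regs)
{
    if (regs->orig_eax < 0)
        return false;           /* not inside a system call */

    switch (regs->eax) {
    case -ERESTARTSYS:
    case -ERESTARTNOINTR:
    case -ERESTARTNOHAND:
        return true;
    default:
        return false;
    }
}

/* Under the extended test, schedule() is skipped only in that case. */
static bool should_schedule_on_stop(const struct fake_regs *regs)
{
    return !in_restartable_syscall(regs);
}
```

Narrowing the test this way would still skip schedule() for a blocked call that returns ERESTARTSYS, so it may not by itself avoid the problem described in the reply.
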
I tried your patch. It means I can no longer interrupt any blocked
system calls with SIGTSTP. I don't think it is correct. We need to
find a way to keep track of who is interrupted in fifo_open and what
it is doing.

-- 
H.J. Lu (hjl@gnu.org)
