Re: I disabled more compiler warnings..
From: Paul Smith
Date: Mon May 11 2020 - 16:25:27 EST
On Mon, 2020-05-11 at 12:33 -0700, Linus Torvalds wrote:
> I wonder if you could just have three different file descriptors:
>
> - the "current token file descriptor"
> - a /dev/null file descriptor
> - the jobserver pipe file descriptor. This is left blocking.
If I'm understanding your suggestion correctly, this is pretty much how
it worked originally. Except I didn't use a /dev/null file descriptor:
I dup the jobserver FD, read() the dup, and then in the signal handler
I _closed_ the dup.
So, I can tell what happened by whether the read() returns EINTR vs.
EBADF.
This is still how things work on systems without pselect() support.
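
A minimal sketch of the dup-and-close trick described above, not the actual GNU make code. All names here (job_rfd, child_handler, prepare_read, read_token) are hypothetical, and the caller is assumed to have installed child_handler for SIGCHLD and to reap any already-exited children before calling prepare_read():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

enum token_result { GOT_TOKEN, INTERRUPTED, DUP_CLOSED, TOKEN_ERROR };

/* Hypothetical: the dup of the jobserver read FD, or -1 if none is open. */
static volatile sig_atomic_t job_rfd = -1;

/* SIGCHLD handler: close the dup so that a read() which has not started
   yet fails with EBADF instead of blocking forever. */
static void child_handler(int sig)
{
    int saved_errno = errno;
    (void)sig;
    if (job_rfd >= 0) {
        close(job_rfd);
        job_rfd = -1;
    }
    errno = saved_errno;
}

/* Run with SIGCHLD blocked so the handler never sees a half-updated FD. */
static int set_dup(int new_fd_source)
{
    sigset_t block, old;
    int fd = -1;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    if (job_rfd >= 0)
        close(job_rfd);
    if (new_fd_source >= 0)
        fd = dup(new_fd_source);
    job_rfd = fd;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return fd;
}

/* Step 1: create the dup that the handler is allowed to close. */
static int prepare_read(int jobserver_fd)
{
    return set_dup(jobserver_fd) < 0 ? -1 : 0;
}

/* Step 2: blocking read on the dup, then classify the outcome. */
static enum token_result read_token(char *token)
{
    ssize_t n;
    int err;

    assert(token != NULL);
    n = read(job_rfd, token, 1);
    err = errno;
    set_dup(-1);                      /* close the dup if still open */

    if (n == 1)
        return GOT_TOKEN;
    if (n < 0 && err == EINTR)
        return INTERRUPTED;           /* signal arrived during read() */
    if (n < 0 && err == EBADF)
        return DUP_CLOSED;            /* handler closed the dup first */
    return TOKEN_ERROR;
}
```

In this sketch, both INTERRUPTED and DUP_CLOSED mean "go reap children and try again"; the distinction only records when the signal arrived relative to the read().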
There's a blog page I wrote about this many (er... MANY) years ago:
http://make.mad-scientist.net/papers/jobserver-implementation/
You can skip the blather at the top: search down to "Then it gets
ugly..." and check the algorithm there to see if that's what you had in
mind.
The problem is that SA_RESTART works well on Linux but is not always
reliable on other OSes, and it's not always possible to catch EINTR
everywhere (for example, in third party libraries like libintl) so you
can't always automatically restart from a SIGCHLD.
See this bug reported on OS X for example:
https://savannah.gnu.org/bugs/?func=detailitem&item_id=46261
But, the issue I have now is not related to SIGCHLD, it's related to
SIGINT/SIGTERM etc. handling. Today, GNU make calls its die() function
directly from the SIGINT signal handler, and die() does A LOT of very
signal-unsafe stuff. Gross, but it's worked like that for 30 years.
My idea was that since GNU make already uses an EINTRLOOP() macro to
check EINTR returns from system calls and restart, I would have the
SIGINT handler just set a variable, then install into the EINTRLOOP
macro a check of that variable, which would let me transition from
signal-handler context back to the main program flow without dropping
checks all over the code (at least, not visibly :)). Maybe I'd check
directly in a few places,
where we may go for some time without a system call.
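
A rough sketch of that idea, under stated assumptions: EINTRLOOP_CHECKED, pending_signal, fatal_handler and die_sketch are hypothetical names, this is not GNU make's actual EINTRLOOP definition, and die_sketch only records the signal where the real die() would clean up and exit:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Set by the handler; the only thing the handler does. */
static volatile sig_atomic_t pending_signal = 0;

/* Hypothetical SIGINT/SIGTERM handler: just remember the signal. */
static void fatal_handler(int sig)
{
    pending_signal = sig;
}

/* Stand-in for die(): records the signal so the sketch can be
   exercised. The real function would clean up and exit. */
static int died_with = 0;

static void die_sketch(int sig)
{
    assert(sig != 0);
    died_with = sig;
}

/* Runs in normal program context, so signal-unsafe work is fine here. */
static void check_pending_signal(void)
{
    int sig = pending_signal;
    if (sig != 0) {
        pending_signal = 0;
        die_sketch(sig);
    }
}

/* Retry the call while it fails with EINTR, checking the flag before
   every attempt and once more after the call finally completes. */
#define EINTRLOOP_CHECKED(_v, _c)                   \
    do {                                            \
        do {                                        \
            check_pending_signal();                 \
            (_v) = (_c);                            \
        } while ((_v) == -1 && errno == EINTR);     \
        check_pending_signal();                     \
    } while (0)
```

A call site would look like `EINTRLOOP_CHECKED(r, read(fd, buf, len));`. Note that a signal arriving after the first check but before the system call blocks is still not noticed until the call returns.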
That worked but I do have to contend with the same issue as the
jobserver: any time I need to wait for something I have a race between
the last time I checked for an interrupt and the wait operation.
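
To make the race concrete: on systems that do have pselect(), the check and the wait can be made atomic by keeping the signal blocked until the wait begins. The sketch below assumes that approach; the names (interrupted, on_sigint, wait_readable_or_signal) are hypothetical, and it does nothing for systems without pselect():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static volatile sig_atomic_t interrupted = 0;

/* Hypothetical SIGINT handler: only sets the flag. */
static void on_sigint(int sig)
{
    (void)sig;
    interrupted = 1;
}

/* Wait until fd is readable or SIGINT has been seen.
   Returns 1 if readable, 0 if interrupted, -1 on error.
   Assumes fd < FD_SETSIZE and that the caller does not
   already have SIGINT blocked. */
static int wait_readable_or_signal(int fd)
{
    sigset_t block, orig;
    fd_set rfds;
    int r, result;

    /* Block SIGINT so it cannot slip in between the check and the wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, &orig) != 0)
        return -1;

    for (;;) {
        if (interrupted) {
            result = 0;
            break;
        }
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        /* pselect() installs the original mask and starts waiting as
           one atomic step, so a pending SIGINT interrupts the wait. */
        r = pselect(fd + 1, &rfds, NULL, NULL, NULL, &orig);
        if (r > 0) {
            result = 1;
            break;
        }
        if (r < 0 && errno != EINTR) {
            result = -1;
            break;
        }
        /* EINTR: some handler ran during the wait; re-check the flag. */
    }

    sigprocmask(SIG_SETMASK, &orig, NULL);
    return result;
}
```

With a plain select() or read() in place of pselect(), a SIGINT delivered after the `interrupted` check but before the wait starts would be missed until something else wakes the process.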
I have had to put this aside for a bit, so I haven't decided how to
address it but I'm sure it's doable. The issue with GNU make is it has
to be as portable as possible: make is a foundational tool for any sort
of system bootstrap. Of course, we could just not support parallel
builds on insufficient systems but...
If people feel this isn't an appropriate topic for lkml, I invite
any interested parties to post to bug-make@xxxxxxx :).