Regarding the too-small stack problem -- that might be a problem for
others, but I've verified it isn't what is causing my crash. I've
rebuilt my gcc build directory so I can reproduce the crash, and I
can reproduce it even when logged in as root _after_ doing
"ulimit -s 65536" to increase the stack.
Eirik Fuller <eirik@netcom.com> has been exchanging email with me,
and he has tracked down what appear to be some problems on _some_
installations of Linux -- or, at least, one problem, but one that
doesn't seem to afflict _my_ particular installation.
The problem he found, through what IMO was some decent debugging work
on his part, was that "struct rlimit" in his
/usr/include/sys/resource.h file declared its members (rlim_cur and
rlim_max) as "int", while Linux (the kernel) writes them as "long" --
64 bits on AXP. The upshot for him was that the call to getrlimit()
in toplev.c (the main() in cc1) was overwriting argc (with the 8MB figure ;-),
which led to problems later on.
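To make that concrete, here is a rough sketch of the mechanism, not a
reproduction of Eirik's actual setup. Every name in it (rlimit_as_header,
rlimit_as_kernel, fake_frame, fake_getrlimit, argc_after_fake_getrlimit)
is made up for illustration, and the "stack frame" is an ordinary struct,
since real stack layout is up to the compiler. It only shows that a writer
using the wider layout runs past the end of the narrower struct and into
whatever sits next to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, NOT copied from any real header. */
struct rlimit_as_header {       /* what the stale header declared */
    int rlim_cur;
    int rlim_max;
};

struct rlimit_as_kernel {       /* what the kernel actually fills in */
    long rlim_cur;
    long rlim_max;
};

/* A pretend stack frame: the caller's struct with a neighbour after it. */
struct fake_frame {
    struct rlimit_as_header rl;
    int argc;
    int slack[8];               /* keeps the simulated overrun in bounds */
};

/* Simulate the kernel storing its (possibly wider) struct at &frame->rl. */
static void fake_getrlimit(struct fake_frame *frame, long cur, long max)
{
    struct rlimit_as_kernel k;

    k.rlim_cur = cur;
    k.rlim_max = max;
    assert(offsetof(struct fake_frame, rl) + sizeof k <= sizeof *frame);
    memcpy((unsigned char *)frame + offsetof(struct fake_frame, rl),
           &k, sizeof k);
}

/* Return what argc looks like after the simulated call. */
static int argc_after_fake_getrlimit(int argc, long cur, long max)
{
    struct fake_frame frame;

    memset(&frame, 0, sizeof frame);
    frame.argc = argc;
    fake_getrlimit(&frame, cur, max);
    return frame.argc;
}
```

Where "long" is wider than "int", argc_after_fake_getrlimit(3, 8388608L,
8388608L) no longer returns 3; where the two types are the same size, it
still does. Which neighbour gets hit on a real stack depends on the
compiler and the ABI.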
I don't have that specific problem. My /usr/include/sys/resource.h #include's
<linux/resource.h>, which properly sets up the rlimit struct -- I've verified
this using "gcc ... -E toplev.c", because You Never Know.
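Besides reading the preprocessor output, one could check at run time. The
following is only a sketch under my own assumptions: the helper name
getrlimit_stays_in_bounds and the 32-byte guard areas are invented here,
and it only catches an overrun that lands within those guards. It calls
the real getrlimit() on a struct rlimit sandwiched between two areas
filled with a known pattern, then checks that the pattern survived.

```c
#include <assert.h>
#include <string.h>
#include <sys/resource.h>

#define GUARD_BYTE 0xA5

/*
 * Hypothetical helper: returns 1 if getrlimit() left the bytes on either
 * side of the struct alone, 0 if a guard byte changed, -1 if the call
 * itself failed.
 */
int getrlimit_stays_in_bounds(void)
{
    struct {
        unsigned char before[32];
        struct rlimit rl;
        unsigned char after[32];
    } probe;
    size_t i;

    assert(sizeof probe.before == sizeof probe.after);

    /* Fill everything, guards included, with a known pattern. */
    memset(&probe, GUARD_BYTE, sizeof probe);

    if (getrlimit(RLIMIT_STACK, &probe.rl) != 0)
        return -1;

    /* Any guard byte that changed means something wrote out of bounds. */
    for (i = 0; i < sizeof probe.before; i++) {
        if (probe.before[i] != GUARD_BYTE || probe.after[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

A return of 1 is what a consistent header/kernel pair should give; a 0
would point at the kind of mismatch Eirik found.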
But, the fact remains that I have a cc1 that not only segfaults immediately
on execution, but crashes gdb as soon as it reaches the call to malloc()
that results from the call to init_lex(). (This is _not_ the first call
to malloc()!!)
I'm too busy with g77 and other stuff to track this down, but I will make it
a high priority _if_ I run into it after I install the newest Craftworks
Linux AXP distribution (meaning it isn't fixed) and there's nobody else
looking into it. Though, Eirik is going to give it a try -- I've uploaded
the buggy cc1 for him to see if he can reproduce the problem, get any further
with his ELF-based system and hand-built ECOFF gdb (my system's ECOFF-based),
etc.
But, at least Eirik already discovered something I hadn't paid any attention
to before -- gcc's toplev.c module, which gets invoked for all the "major"
compilation phases of each language (gcc's cc1, g++'s cc1plus, g77's f771,
and so on), tries to increase the stack size from the current value to the
maximum as soon as it can.
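For anyone who hasn't looked at that sort of code, the idea is roughly the
following. This is a minimal sketch of the general technique, not the
actual text of toplev.c; the function name raise_stack_limit_to_max is my
own invention.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: raise the soft stack limit to the hard limit.
 * Returns 0 on success (or if nothing needed doing), -1 on failure.
 */
int raise_stack_limit_to_max(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    /* The soft limit is never supposed to exceed the hard limit. */
    assert(rl.rlim_cur <= rl.rlim_max);

    if (rl.rlim_cur == rl.rlim_max)
        return 0;               /* already at the maximum */

    /* An unprivileged process may raise its soft limit up to the hard one. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_STACK, &rl) == 0 ? 0 : -1;
}
```

A program would call this once, early in main(), before doing anything
that needs deep recursion.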
Maybe 8MB is plenty, but if there's any concern, perhaps the default
_maximum_ stack size for Linux/AXP could be quadrupled. That way, most
programs would still get the 8MB size (which presumably yields better
performance or less resource hogging, when that's enough), while programs
that know they want more space can get it by calling setrlimit(), like gcc's
toplev.c does.
Just a thought.
In any case, the fact that my cc1 problem goes away just by changing the size
of a global array of constants suggests to me that there _is_ a hard-to-track
bug lurking in there somewhere. It might be convenient for me, and everyone,
if it disappears (e.g. once I upgrade), but I'd really rather Eirik or someone
track down both the bug that crashes cc1 _and_ the bug that crashes gdb,
so that, at least, we can be sure they've been fixed, not just swept
under the rug by having the locations of things in memory move sufficiently
to hide them. OTOH, given that so much has changed in Linux since my system
was put together last May, it's entirely reasonable to expect that this lurking
bug lurks no longer -- has been fixed by the ongoing work. It'll be wonderful
if we get a definitive answer on this.
tq vm, (burley)