-O3.... again
From: Timothy Miller
Date: Tue Mar 16 2004 - 11:28:12 EST
I know this has been beat to death, but I was wondering about something,
and google isn't being forthcoming.
I understand that the biggest problem with -O3 is the automatic function
inlining. It tends to make things worse, due to cache misses.
Well, the default maximum for automatic inlining for GCC (--param
max-inline-insns-auto=n) is 300 pseudo instructions, which sounds
awfully high to me (although I don't know what a pseudo instruction
quite corresponds to).
Has anyone tinkered with different values for -finline-limit, or
specifically max-inline-insns-auto to see if they could actually get any
benefit out of it?
It seems to me that as a function grows beyond a certain size, the value
of inlining it diminishes rapidly. Only when the function-call overhead
is a significant part of the over-all run-time of the rest of the
function does it really help to inline. Well, if the kernel isn't
getting benefit from the defaults for -O3, then perhaps the defaults are
wrong and need to be tweaked.
Anyone experiment with this? Any thoughts?
I doubt this would apply well right off to the kernel, but right now,
I'm doing gentoo emerges of GCC with varying CFLAGS settings. First, I
emerge GCC with the experimental values of CFLAGS. Then I change the
CFLAGS to a standard setting and time it (the timed run must always have
the same parameters for the target). My objective is to determine the
MINIMUM value of max-inline-insns-auto which yields an improvement over -O2.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/