Re: BUG_ON() in workingset_node_shadows_dec() triggers

From: Willy Tarreau
Date: Fri Oct 07 2016 - 14:27:12 EST

Hi Kees,

On Fri, Oct 07, 2016 at 10:33:33AM -0700, Kees Cook wrote:
> I'll quit debating how to change things, but I'll just try to point
> out that the "stop execution" logic, currently, is not an accident.
> Without CONFIG_BUG, BUG is defined as "do {} while (1)", and without
> CONFIG_HAVE_ARCH_BUG, BUG is defined as "printk(...); panic(...);".

I think we're all convinced about this *initial* intent. However, among
the 3197 BUG() and 9594 BUG_ON() that are present in v4.8, how many
of them should *really* be there? I'm seeing that during the 4.8 development
cycle alone, we managed to add 81 BUG() and 55 BUG_ON(). I doubt we
found so many valid reasons to kill the system. 38 of them were added
to drivers/. The problem is that this "style" has accumulated over the
years. We only had 1739 BUG() and 1801 BUG_ON() in 2.6.12. So we
roughly multiplied that by 4 in 11 years.

The current trend actually seems to be to remove some of them: 3 were
removed from lib/, 4 from include/, 29 from fs/ and one from mm/, but
two were added to kernel/ and 3 others to net/.

Maybe changing only kernel/ and mm/'s BUG() occurrences to something
like "I_KNOW_I_WILL_BE_BLAMED_FOR_THIS_BUG()" and letting them kill
until they're properly audited, and leaving the other ones non-fatal
could be a reasonable tradeoff to start with ?
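
To make the idea a bit more concrete, here is a minimal userspace sketch of
the split, not kernel code. It assumes a non-fatal BUG()/BUG_ON() that only
reports and counts, plus the deliberately ugly fatal name from above. The
counter nonfatal_bug_count and the helper check_len() are hypothetical and
only there for illustration; fprintf() and abort() stand in for printk()
and panic().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter so callers can see that a non-fatal BUG fired. */
static int nonfatal_bug_count;

/* Non-fatal variant (assumption): report the location and keep running. */
#define BUG() \
	do { \
		fprintf(stderr, "BUG at %s:%d (continuing)\n", \
			__FILE__, __LINE__); \
		nonfatal_bug_count++; \
	} while (0)

#define BUG_ON(cond) \
	do { \
		if (cond) \
			BUG(); \
	} while (0)

/* Fatal variant under the proposed name: report, then stop execution. */
#define I_KNOW_I_WILL_BE_BLAMED_FOR_THIS_BUG() \
	do { \
		fprintf(stderr, "fatal BUG at %s:%d\n", __FILE__, __LINE__); \
		abort(); \
	} while (0)

/* Hypothetical driver-style check: complain, then return an error. */
static int check_len(int len)
{
	BUG_ON(len < 0);
	return len < 0 ? -1 : len;
}
```

With this, check_len(-1) logs a message and returns -1 instead of stopping
everything, while any site explicitly converted to the long name would
still abort.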