RE: [PATCH v3 0/3] sched: Always check the integrity of the canary

From: David Laight
Date: Thu Sep 11 2014 - 12:04:30 EST


From: Aaron Tomlin
> Currently in the event of a stack overrun a call to schedule()
> does not check for this type of corruption. This corruption is
> often silent and can go unnoticed. However once the corrupted
> region is examined at a later stage, the outcome is undefined
> and often results in a sporadic page fault which cannot be
> handled.
>
> The first patch adds a canary to init_task's end of stack.
> While the second patch provides a helper to determine the
> integrity of the canary. The third checks for a stack
> overrun and takes appropriate action since the damage
> is already done, there is no point in continuing.

Clearly you've just been 'bitten' by a kernel stack overflow.
But a simple 'canary' isn't going to find most overflows
and will give a false sense of security.

The canary only works if the stack is densely written, so that
an overflow actually overwrites the canary word.
In practice the stack alignment rules create gaps, and the
most likely cause of overflow is a large on-stack buffer
that isn't actually written to.
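
As a rough userspace illustration of that failure mode (not kernel
code), the sketch below models a downward-growing stack with a
canary word at its lowest address. Everything in it (struct sim,
sim_call(), the sizes, the CANARY value) is made up for the
example. A frame that reserves a large buffer but writes only the
start of it lands entirely below the canary, so the check passes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 4096u        /* simulated thread stack */
#define SPILL_SIZE 4096u        /* whatever lives below the stack */
#define CANARY     0x57AC6E9Du  /* arbitrary magic value for the sketch */

/* Layout, low to high: [ spill area ][ canary ... rest of stack ]. */
struct sim {
	unsigned char mem[SPILL_SIZE + STACK_SIZE];
	size_t sp;                  /* offset of the stack pointer in mem */
};

static void sim_init(struct sim *s)
{
	uint32_t canary = CANARY;

	memset(s->mem, 0, sizeof(s->mem));
	memcpy(&s->mem[SPILL_SIZE], &canary, sizeof(canary));
	s->sp = sizeof(s->mem);     /* empty stack: sp at the top */
}

static int sim_canary_ok(const struct sim *s)
{
	uint32_t v;

	memcpy(&v, &s->mem[SPILL_SIZE], sizeof(v));
	return v == CANARY;
}

/*
 * Reserve a frame of frame_size bytes and write only its lowest
 * 'written' bytes, like a large buffer that is only partly filled.
 * Returns 1 if the frame extends below the end of the stack.
 */
static int sim_call(struct sim *s, size_t frame_size, size_t written)
{
	assert(frame_size <= s->sp && written <= frame_size);
	s->sp -= frame_size;
	memset(&s->mem[s->sp], 0xAA, written);
	return s->sp < SPILL_SIZE;
}
```

With these sizes, sim_call(&s, 5000, 64) on a fresh stack overruns
the stack and corrupts the spill area, yet sim_canary_ok() still
reports success. Only a dense write such as sim_call(&s, 5000, 5000)
destroys the canary.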

The only real way to detect kernel stack overflow is to arrange
for an unmapped page beyond the stack, so that any access past
the end of the stack faults immediately.
That costs kernel virtual address space (KVA), but not much else.

David