Re: correctness of BPF stack size checking logic for multi-function programs?

From: Alexei Starovoitov
Date: Thu Dec 21 2017 - 22:37:56 EST


On Fri, Dec 22, 2017 at 02:14:45AM +0100, Jann Horn wrote:
> Hi!
>
> I saw the recently-added support for multiple functions in a single
> program in BPF. I've stumbled over something that looks like it might
> be a bug; I haven't verified it yet, but I thought I should give you a
> heads-up before this lands in a release in case I'm right. If I'm
> wrong, it might be worth adding a comment to stacksafe() that explains
> why.
>
> stacksafe() has the following code:
>
> /* if explored stack has more populated slots than current stack
> * such stacks are not equivalent
> */
> if (old->allocated_stack > cur->allocated_stack)
> return false;
>
> Note that if the old state had a smaller stack than the new state,
> that is permitted because it is guaranteed that none of the extra
> space in the new state will be used.
>
> However, as far as I can tell, this can be used to smuggle a call
> chain with a total stack size bigger than the permitted maximum
> through the verifier, with code roughly as follows:
>
> void b(void) {
> <allocate 300 bytes stack>
> }
> void main(void) {
> if (<condition>) {
> <allocate 300 bytes stack>
> }
> b();
> }
>
> AFAICS, if the verifier first verifies the branch of main() where
> <condition> is false, it will go down into b, seeing a total stack
> size of around 300 bytes. Afterwards, it will verify the branch of
> main() where <condition> is true, but the states will converge after
> the branch, preventing the verifier from going down into b() again and
> discovering through update_stack_depth() that actually, the total
> stack size is around 600 bytes. From a coarse look, it seems like this
> might be usable to overflow the kernel stack, which would be
> exploitable on systems without vmapped stack?

I'm lost at part 'going down into b() again'..
b() is not going to be processed first.
The verifier always starts from main() and until we add bpf libraries
and dynamic linking the main() has to be placed first in the insn sequence,
otherwise b() (like in above) will be processed by check_cfg() first
and main() declared as dead code and whole thing rejected.
So only valid sequence is:
void main(void) {
if (<condition>) {
<allocate 300 bytes stack>
}
b();
}
void b(void) {
<allocate 300 bytes stack>
}

if the code is structured like:

void main(void) {
if (<condition>) goto alloc;
b:
b();
return;
alloc:
<allocate 300 bytes stack>
goto b;
}
void b(void) {
<allocate 300 bytes stack>
}

then in the fall-through branch the b() will be first
walked with empty caller's stack and the verifier will see 300
alloc inside 'b' and declare it as ok,
then it will reach 'exit' and go back to walk 'alloc:' branch
and will walk into b() again and now it will see 300 + 300,
so I don't think there is a bug here,
but I will rewrite a test case for it unless you beat me to it :)

Please suggest the wording for the comment and where to add it.
I think we're long overdue for the verifier doc update.
The comment at the top of verifier.c is stale.