Re: [x86.git#mm] stack protector fixes, vmsplice exploit

From: Ingo Molnar
Date: Thu Feb 14 2008 - 14:01:27 EST



* pageexec@xxxxxxxxxxx <pageexec@xxxxxxxxxxx> wrote:

> the commit comment says:
>
> > This hack was true up to the point the stackprotector added another
> > word to the stack frame. Shifting all the addresses by 8 bytes,
> > crashing and burning any exec attempt.
>
> actually, that's not the problem here because the canary is in the
> local variable area, it doesn't affect the passed arguments. what
> happens here is that gcc treats the argument area as owned by the
> callee, not the caller and is allowed to do certain tricks. for ssp it
> will make a copy of the struct passed by value into the local variable
> area and pass *its* address down, and it won't copy it back into the
> original instance stored in the argument area.
>
> so once sys_execve returns, the pt_regs passed by value hasn't at all
> changed and its default content will cause a nice double fault (FWIW,
> this part took me the longest to debug, being down with cold didn't
> help it either ;).

heh, indeed :-) I saw the double fault and was happy that you showed how
to fix it (it did not seem obvious to me _at all_, and i've seen my fair
share of fixes) and i quickly lured myself into having understood it :-/

I've now updated the commit message with your (correct) description.

> > Note: this means that we call __switch_to() [and its sub-functions]
> > still with the old canary, but that is not a problem, both the
> > previous and the next task has a high-quality canary. The only
> > (mostly academic) disadvantage is that the canary of one task may
> > leak onto the stack of another task, increasing the risk of
> > information leaks, were an attacker able to read the stack of
> > specific tasks (but not that of others).
>
> the best practical defense against leaking the canary is to change its
> value on every syscall but it has some performance impact in
> microbenchmarks.

yeah, that's not really practical (especially as it would deplete our
entropy pool pretty fast would could in some circumstances introduce a
higher risk to the system than the risk of a canary leaking).

I think we can avoid the leakage across tasks by being careful during
context-switch time: never calling with the old canary still in the PDA.
I think this should be fairly easy as we'd just have to load the new
pdaptr in the switch_to() assembly code.

[ the only (even more academic) possibility here would be an NMI to hit
during those two instructions that update RSP and load the new pdpptr,
and leaving the old canary on the stack. But NMIs go to separate stack
frames anyway on 64-bit so this isnt really a practical worry IMO. ]

Hm?

> > x86: stackprotector: mix TSC to the boot canary
>
> this should probably be implemented in a general get_random_long()
> function...

yeah ... this is the second time i could have used it recently.

> > x86: test the presence of the stackprotector
> > x86: streamline stackprotector
>
> the config comment says -fstack-protector whereas you're really
> enabling the stonger -fstack-protector-all feature (without the latter
> the ssp test function sample_stackprotector_fn wouldn't even be
> instrumented ;-).

yeah, indeed. I fixed the comments and flipped the commits around :-)

TODO: perhaps all vsyscall functions need to move into separate .o
files. (probably academic as the functions affected by that tend to be
very simple and do not deal with anything overflowable.)

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/