On Thu, Nov 24, 2022 at 01:48:45AM -0800, Deepak Gupta wrote:
> commit 31da94c25aea ("riscv: add VMAP_STACK overflow detection") added
> support for CONFIG_VMAP_STACK. If overflow is detected, the CPU switches
> to `shadow_stack` temporarily before finally switching to the per-cpu
> `overflow_stack`.
>
> If two CPUs/harts race and both end up overflowing the kernel stack,
> one or both will end up corrupting each other's state because
> `shadow_stack` is not per-cpu. This patch optimizes the per-cpu overflow
> stack switch by directly picking the per-cpu `overflow_stack` and gets
> rid of `shadow_stack`.
>
> Following are the changes in this patch:
>
> - Defines an asm macro to obtain per-cpu symbols in a destination
>   register.
> - In entry.S, when overflow is detected, the per-cpu overflow stack is
>   located using the per-cpu asm macro. Computing a per-cpu symbol
>   requires a temporary register. x31 is saved away into CSR_SCRATCH

This only works if CSR_SCRATCH doesn't hold a valid saved register
value, but see below.

>   (CSR_SCRATCH is anyways zero since we're in kernel).
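
For readers following along, the difference between the shared
`shadow_stack` and the per-cpu `overflow_stack` can be pictured with a
small user-space C model. This is only a sketch, not the kernel code:
the NR_CPUS and OVERFLOW_STACK_SIZE values and both helper functions are
made up for illustration, and the real lookup is done in assembly.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4                 /* illustrative value, not from the patch */
#define OVERFLOW_STACK_SIZE 4096  /* illustrative value, not from the patch */

/* Old scheme (modelled): a single temporary stack shared by every hart. */
static unsigned char shadow_stack[OVERFLOW_STACK_SIZE];

/* New scheme (modelled): one overflow stack slot per hart. */
static unsigned char overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE];

/* Hypothetical helper: every hart gets the same stack top, hence the race. */
static unsigned char *shadow_stack_top(unsigned int cpu)
{
    (void)cpu; /* the shared stack ignores which hart is asking */
    return shadow_stack + OVERFLOW_STACK_SIZE;
}

/* Hypothetical helper: stacks grow down, so return the end of this hart's slot. */
static unsigned char *overflow_stack_top(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return overflow_stack[cpu] + OVERFLOW_STACK_SIZE;
}
```

In this model two harts that overflow at the same time get the same
address from shadow_stack_top(), while overflow_stack_top() hands each
hart a disjoint region.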
To be honest, before [1] I had a similar idea of keeping the per-cpu
usage, but that solution doesn't work. The key point is that there is
another VMAP_STACK bug in the current riscv implementation: it only
checks for vmap stack overflow when coming from kernelspace, but it
should check when coming from both kernelspace and userspace. So we
can't assume CSR_SCRATCH is always zero and free to use. The only
available solution is my fix [1], which only makes use of tp. But since
[1] modifies a lot of code, it's not ideal to merge it as a fix, so [2]
was suggested and sent out.
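
The CSR_SCRATCH concern can be illustrated with a deliberately
simplified C model. It is a sketch under stated assumptions, not the
entry.S logic: `struct hart_model` and both helpers are hypothetical
names, and the only behaviour taken from this thread is that
CSR_SCRATCH is zero on a trap taken from the kernel and may hold a live
value on a trap taken from userspace.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two pieces of hart state that matter here. */
struct hart_model {
    uintptr_t x31;      /* register the patch wants to use as a temporary */
    uintptr_t scratch;  /* stands in for CSR_SCRATCH */
};

/* Stashing is only harmless when scratch holds nothing worth keeping. */
static bool stash_is_safe(const struct hart_model *h)
{
    return h->scratch == 0;
}

/*
 * Model of "save x31 into CSR_SCRATCH". The previous scratch value is
 * returned so the caller can see what would otherwise be lost.
 */
static uintptr_t stash_x31(struct hart_model *h)
{
    uintptr_t previous = h->scratch;

    h->scratch = h->x31;
    return previous;
}
```

With scratch equal to zero (trap from the kernel) nothing is lost. With
a non-zero scratch (trap from userspace) the value returned by
stash_x31() is exactly what the overflow path would have overwritten.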
PS: I plan to send a fix for the missing FROM_USERSPACE check after the
race fix is merged.
[1] https://lore.kernel.org/linux-riscv/20220925175356.681-1-jszhang@xxxxxxxxxx/T/#t
[2] https://lore.kernel.org/linux-riscv/Y347B0x4VUNOd6V7@xhacker/T/#t