RE: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation
From: Tomohiro Misono (Fujitsu)
Date: Fri Mar 03 2023 - 04:41:49 EST
> > <snip>
> > > Testing
> > > =======
> > >
> > > - I have run all of the livepatch selftests successfully. I have written a
> > > couple of extra selftests myself which I will be posting separately
> > Hi,
> >
> > What test configuration/environment are you using for the tests?
> > When I tried kselftest with a Fedora-based config on a VM, I got errors
> > because the livepatch transition won't finish until a signal is sent
> > (i.e. it takes 15s for every transition).
> >
> > [excerpt from test result]
> > ```
> > $ sudo ./test-livepatch.sh
> > TEST: basic function patching ... not ok
> >
> > --- expected
> > +++ result
> > @@ -2,11 +2,13 @@
> > livepatch: enabling patch 'test_klp_livepatch'
> > livepatch: 'test_klp_livepatch': initializing patching transition
> > livepatch: 'test_klp_livepatch': starting patching transition
> > +livepatch: signaling remaining tasks
> > livepatch: 'test_klp_livepatch': completing patching transition
> > ```
>
> It might be interesting to see what process is blocking the
> transition. The transition state is visible in
> /proc/<pid>/patch_state.
>
> The transition is blocked when a process is in KLP_UNPATCHED state.
> It is defined in include/linux/livepatch.h:
>
> #define KLP_UNPATCHED 0
>
> Well, the timing against the transition is important. The following
> might help to see the blocking processes:
>
> $> modprobe livepatch-sample ; \
> sleep 1; \
> for proc_path in \
> `grep "\-1" /proc/*/patch_state | cut -d '/' -f-3` ; \
> do \
> cat $proc_path/comm ; \
> cat $proc_path/stack ; \
> echo === ; \
> done
>
> After this the livepatch has to be manually disabled and removed:
>
> $> echo 0 >/sys/kernel/livepatch/livepatch_sample/enabled
> $> rmmod livepatch_sample
Thanks for the suggestion. This is quite helpful for debugging.
I did some tests and, in short, I could run all livepatch selftests successfully
on a kernel built with clang 15 when RANDOMIZE_KSTACK_OFFSET=n.
Below is my analysis. Please let me know if I'm wrong.
When I checked the stack state during a livepatch transition, I saw that some
tasks sleeping inside a system call had not transitioned. For example, I saw a
task with the following stack:
```
sshd
[<0>] do_select+0x5cc/0x64c
[<0>] core_sys_select+0x174/0x210
[<0>] __arm64_sys_pselect6+0x11c/0x384
[<0>] invoke_syscall+0x78/0x108
[<0>] el0_svc_common+0xc0/0xfc
[<0>] do_el0_svc+0x38/0xd0
[<0>] el0_svc+0x34/0x110
[<0>] el0t_64_sync_handler+0x84/0xf0
[<0>] el0t_64_sync+0x190/0x194
```
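For reference, the check above can be repeated with a small helper like the
one below. It is only a sketch: the function name list_tasks_in_state is
made up for illustration, and it assumes the patch_state values quoted
earlier in this thread (0 for KLP_UNPATCHED). The optional second argument
only exists so the function can be pointed at a directory other than /proc.

```sh
# List "<pid> <comm>" for every task whose livepatch patch_state equals
# the given value. Hypothetical helper, not part of the kernel selftests.
# Usage: list_tasks_in_state STATE [PROC_ROOT]
list_tasks_in_state() {
    state=$1
    root=${2:-/proc}
    for f in "$root"/[0-9]*/patch_state; do
        [ -r "$f" ] || continue
        # Tasks can exit while we scan, so skip entries that fail to read.
        val=$(cat "$f" 2>/dev/null) || continue
        [ "$val" = "$state" ] || continue
        dir=${f%/patch_state}
        printf '%s %s\n' "${dir##*/}" "$(cat "$dir/comm" 2>/dev/null)"
    done
}
```

Running "list_tasks_in_state 0" while a transition is in progress should
print the tasks that are still unpatched. Reading patch_state typically
needs root.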
Then I noticed that invoke_syscall contains instructions that add a random
offset to sp when RANDOMIZE_KSTACK_OFFSET=y, which is the case above.
Indeed, I can see sp being modified in the binary:
```
$ objdump -d vmlinux --disassemble=invoke_syscall
...
ffff80000803076c <invoke_syscall>:
...
ffff8000080307b4: 9100011f mov sp, x8
...
ffff80000803085c: d65f03c0 ret
```
This marks the instruction as UNRELIABLE because the sp value is not deterministic:
https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/arch/arm64/decode.c#L173
which in turn skips the generation of ORC data:
https://github.com/madvenka786/linux/blob/orc_v3/tools/objtool/dcheck.c#L313
I can confirm this in the ORC dump of vmlinux:
```
./tools/objtool/objtool --dump vmlinux
...
# no entry in range of invoke_syscall (ffff80000803076c - ffff80000803085c)
ffff800008030764: cfa:sp+0 x29:cfa+0 type:call end:0
ffff800008030874: cfa:(und) x29:(und) type:call end:0
ffff800008030874: cfa:sp+0 x29:cfa+0 type:call end:0
...
```
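The same check can be done for any function with a small filter over the
dump output. This is only a sketch: orc_entries_in_range is a made-up name,
and it assumes that each entry line starts with a lowercase hex address
followed by a colon, as in the excerpt above. Addresses are compared as
strings, so both bounds must have the same width and no 0x prefix.

```sh
# Print the ORC dump lines (read from stdin) whose address lies in
# [START, END]. Hypothetical helper for illustration only.
# Usage: orc_entries_in_range START END < dump.txt
orc_entries_in_range() {
    LC_ALL=C awk -v lo="$1" -v hi="$2" '
        /^[0-9a-f]+:/ {
            addr = $1
            sub(/:$/, "", addr)
            # Appending "" forces string comparison, which is safe for
            # equal-width lowercase hex and avoids 64-bit precision loss.
            if ((addr "") >= (lo "") && (addr "") <= (hi "")) print
        }'
}
```

For example, piping the dump through
"orc_entries_in_range ffff80000803076c ffff80000803085c" prints nothing
for the excerpt above, which matches the missing entries for invoke_syscall.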
So, when a livepatch is applied, the stack trace of a task containing
invoke_syscall cannot be validated in arch_stack_walk_reliable(), and the
transition won't happen until the fake signal is delivered (unless the task's
state changes).
It seems that stack validation itself works as intended.
As I said, when RANDOMIZE_KSTACK_OFFSET=n, the selftests run fine.
Or have I misunderstood something completely?
Regards,
Tomohiro