Re: CONFIG_ORC_UNWINDER=y breaks get_wchan()?

From: Josh Poimboeuf
Date: Tue Sep 21 2021 - 20:17:04 EST


On Tue, Sep 21, 2021 at 12:32:49PM -0700, Vito Caputo wrote:
> Is this an oversight of the ORC_UNWINDER implementation? It's
> arguably a regression to completely break wchans for tools like `ps -o
> wchan` and `top`, or my window manager and its separate monitoring
> utility. Presumably there are other tools out there sampling wchans
> for monitoring as well; there's also an internal use of get_wchan() in
> kernel/sched/fair.c for sleep profiling.
>
> I've occasionally seen, when monitoring at a high sample rate (60 Hz) on
> something churny like a parallel kernel or systemd build, a
> spurious non-zero sample coming out of /proc/[pid]/wchan containing a
> hexadecimal address like 0xffffa9ebc181bcf8. This all smells broken:
> is get_wchan() occasionally spitting out random junk that kallsyms
> can't resolve, because get_wchan() is completely ignorant of
> ORC_UNWINDER's effects?

Hi Vito,

Thanks for reporting this. Does this patch fix your issue?

https://lkml.kernel.org/r/20210831083625.59554-1-zhengqi.arch@xxxxxxxxxxxxx

Though, considering wchan has been silently broken for four years, I do
wonder what the impact would be if we were to just continue to show "0"
(and change frame pointers to do the same).

The kernel is much more cautious than it used to be about exposing this
type of thing. Can you elaborate on your use case?

If we do keep it, we might want to require CAP_SYS_ADMIN anyway, for
similar reasons as

f8a00cef1720 ("proc: restrict kernel stack dumps to root")

... since presumably proc_pid_wchan()'s use of '%ps' can result in an
actual address getting printed if the unwind gets confused, thanks to
__sprint_symbol()'s backup option if kallsyms_lookup_buildid() doesn't
find a name.

Though, instead of requiring CAP_SYS_ADMIN, maybe we can just fix
__sprint_symbol() to not expose addresses?
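
To make the two behaviours concrete, here is a minimal userspace sketch. It
is not kernel code: the symbol table, lookup_name() and both sprint_*()
helpers are hypothetical stand-ins, and the "0" output of the second helper
is only one assumed way of "not exposing addresses". The first helper models
the fallback described above (no name found, so the raw address is
formatted); the second prints a fixed placeholder instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a kallsyms table (addresses are made up). */
struct sym {
	unsigned long long start;
	unsigned long long size;
	const char *name;
};

static const struct sym symtab[] = {
	{ 0x1000ULL, 0x100ULL, "schedule_timeout" },
	{ 0x2000ULL, 0x080ULL, "do_wait" },
};

/* Return the name of the symbol containing addr, or NULL if none does. */
static const char *lookup_name(unsigned long long addr)
{
	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		if (addr >= symtab[i].start &&
		    addr - symtab[i].start < symtab[i].size)
			return symtab[i].name;
	}
	return NULL;
}

/* Models the fallback: an unresolved address is printed as hex. */
static int sprint_symbol_fallback(char *buf, size_t len,
				  unsigned long long addr)
{
	const char *name = lookup_name(addr);

	assert(buf && len > 0);
	if (!name)
		return snprintf(buf, len, "0x%llx", addr);
	return snprintf(buf, len, "%s", name);
}

/* Assumed alternative: an unresolved address is printed as "0". */
static int sprint_symbol_no_addr(char *buf, size_t len,
				 unsigned long long addr)
{
	const char *name = lookup_name(addr);

	assert(buf && len > 0);
	return snprintf(buf, len, "%s", name ? name : "0");
}
```

With this table, a confused unwind that hands 0xffffa9ebc181bcf8 to the
first helper yields the hex string, while the second yields "0", which is
what wchan already shows when nothing useful is found.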

Or is there some other reason for needing CAP_SYS_ADMIN? Jann?

--
Josh