Re: [PATCH v2 1/2] livepatch: Remove duplicate warning about missing reliable stacktrace support

From: Josh Poimboeuf
Date: Tue May 07 2019 - 17:25:27 EST


On Tue, May 07, 2019 at 04:28:47PM +0200, Petr Mladek wrote:
> > > > > Also this check is effectively the same as the klp_have_reliable_stack()
> > > > > check which is done in kernel/livepatch/core.c. So I think it would be
> > > > > clearer and more consistent if the same check is done here:
> > > > >
> > > > > if (!klp_have_reliable_stack())
> > > > > return -ENOSYS;
> > >
> > > Huh, it smells with over engineering to me.
> >
> > How so? It makes the code more readable and the generated code should
> > be much better because it becomes a build-time check.
>
> save_stack_trace_tsk_reliable() returns various error codes.
> We catch a specific one because otherwise the message below
> might be misleading.
>
> I do not see why we should prevent this error by calling
> a custom hack: klp_have_reliable_stack()?

I wouldn't call it a hack. It's a simple build-time check.

> Regarding reliability. If anyone changes semantic of
> save_stack_trace_tsk_reliable() error codes, they would likely
> check if all users (one at the moment) handle it correctly.
>
> On the other hand, the dependency between the -ENOSYS
> return value and klp_have_reliable_stack() is far from
> obvious.

I don't follow your point.

> If we want to discuss and fix this to the death. We should change
> the return value from -ENOSYS to -EOPNOTSUPP. The reason
> is the same as in the commit 375bfca3459db1c5596
> ("livepatch: core: Return EOPNOTSUPP instead of ENOSYS").
>
> Note that EOPNOTSUPP is the same errno as ENOTSUP, see
> man 3 errno.

Sure, but that's a separate issue.

> > But I think Miroslav's suggestion to revert 1d98a69e5cef would be even
> > better.
>
> AFAIK, Miroslav wanted to point out that your opinion was inconsistent.

How is my opinion inconsistent?

Is there something wrong with the original approach, which was reverted
with

1d98a69e5cef ("livepatch: Remove reliable stacktrace check in klp_try_switch_task()")

?

As I mentioned, that has some advantages:

- The generated code is simpler/faster because it uses a build-time
check.

- The code is more readable IMO. Especially if the check is done higher
up the call stack by reverting 1d98a69e5cef. That way the arch
support short-circuiting is done in the logical place, before doing
any more unnecessary work. It's faster, but also, more importantly,
it's less surprising.

- It's also more consistent with other code, since the intent of this
check is the same as the klp_have_reliable_stack() use in
klp_enabled_patch().

If you disagree with those, please explain why.

> > > > > ret = save_stack_trace_tsk_reliable(task, &trace);
> > > > >
> > > > > [ no need to check ret for ENOSYS here ]
> > > > >
> > > > > Then, IMO, no comment is needed.
> > > >
> > > > BTW, if you agree with this approach then we can leave the
> > > > WARN_ON_ONCE() in save_stack_trace_tsk_reliable() after all.
> > >
> > > I really like the removal of the WARN_ON_ONCE(). I consider
> > > it an old fashioned way used when people are lazy to handle
> > > errors. It might make sense when the backtrace helps to locate
> > > the context but the context is well known here. Finally,
> > > WARN() should be used with care. It might cause reboot
> > > with panic_on_warn.
> >
> > The warning makes the function consistent with the other weak functions
> > in stacktrace.c and clarifies that it should never be called unless an
> > arch has misconfigured something. And if we aren't even checking the
> > specific ENOSYS error as I proposed then this warning would make the
> > error more obvious.
>
> I consider both WARN() and error value as superfluous. I like the
> error value because it allows users to handle the situation as
> they need it.

... but if there are no users of the weak function then the point is
moot.

> Best Regards,
> Petr
>
> PS: This is my last mail in the thread this week. I will eventually
> return to it with a clear head next week. It is all nitpicking from
> my POV and I have better things to do.

I don't think it's helpful to characterize it as nitpicking. The
details of the code are important.

--
Josh