Re: [BUG] Use of probe_kernel_address() in task_rcu_dereference() without checking return value

From: Linus Torvalds
Date: Fri Aug 30 2019 - 11:41:28 EST

Next message: Doug Ledford: "[PULL REQUEST] Please pull rdma.git"
Previous message: Stephen Smalley: "Re: [PATCH 10/11] selinux: Implement the watch_key security hook [ver #7]"
In reply to: Linus Torvalds: "Re: [BUG] Use of probe_kernel_address() in task_rcu_dereference() without checking return value"
Next in thread: Oleg Nesterov: "Re: [BUG] Use of probe_kernel_address() in task_rcu_dereference() without checking return value"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, Aug 30, 2019 at 8:30 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Do you actually see that behavior?
>
> Because the following lines:
>
> 	smp_rmb();
> 	if (unlikely(task != READ_ONCE(*ptask)))
> 		goto retry;

Side note: that code had better not be performance-critical, because
"probe_kernel_address()" is actually really really slow.

We really should do a real set of "read kernel with fault handling" functions.

We have *one* right now: load_unaligned_zeropad(), but that one
assumes that at least the first byte is valid and that the fault can
only be because of unaligned page crossers.

The problem with "probe_kernel_address()" is that it really just
re-uses the user access functions, and then plays games to make them
work for kernel addresses. Games we shouldn't play, and it's all very
expensive when it really shouldn't need to be. That includes changing
the address limits, but also doing all the system instructions to allow
user accesses (PAN on ARM, clac/stac on x86).

Doing a set of "access kernel with error handling" should be trivial,
it's just that every architecture needs to do it. So we'd probably
need to do something where architectures can say "I have it", and fall
back on the silly legacy implementation otherwise.
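
A rough userspace sketch of that opt-in scheme might look like the
following. Every name in it is made up for illustration
(ARCH_HAS_READ_KERNEL_NOFAULT, arch_read_kernel_nofault(),
read_kernel_nofault(), toy_mem, toy_range_ok()); none of them is an
existing kernel interface. A bounds check against a toy buffer stands in
for the real page fault, so this only shows the "I have it, otherwise
fall back" dispatch, not actual fault handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for "mapped kernel memory": only this buffer is readable. */
static unsigned char toy_mem[64];

/* Hypothetical check that takes the place of a real page fault here. */
static int toy_range_ok(const void *src, size_t size)
{
	uintptr_t start = (uintptr_t)src;
	uintptr_t base = (uintptr_t)toy_mem;

	return size <= sizeof(toy_mem) &&
	       start >= base &&
	       start - base <= sizeof(toy_mem) - size;
}

#ifdef ARCH_HAS_READ_KERNEL_NOFAULT
/*
 * An architecture that says "I have it" would define the (hypothetical)
 * macro above and supply its own fast helper.
 */
long arch_read_kernel_nofault(void *dst, const void *src, size_t size);
#define read_kernel_nofault(dst, src, size) \
	arch_read_kernel_nofault(dst, src, size)
#else
/*
 * Generic fallback, standing in for the slow legacy path. It returns 0
 * on success and -EFAULT if the source range is not readable.
 */
static long read_kernel_nofault(void *dst, const void *src, size_t size)
{
	/* Only the source may be bad; the destination is trusted. */
	assert(dst != NULL);

	if (!toy_range_ok(src, size))
		return -EFAULT;

	memcpy(dst, src, size);
	return 0;
}
#endif
```

Callers would only ever use read_kernel_nofault() and check its return
value, so an architecture could switch from the fallback to a native
helper without touching any call site.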

Linus
