Re: [PATCH] general protection fault in sock_has_perm

From: Mark Salyzyn
Date: Fri Jan 19 2018 - 12:47:10 EST


On 01/19/2018 09:06 AM, Paul Moore wrote:
On Fri, Jan 19, 2018 at 10:49 AM, Mark Salyzyn <salyzyn@xxxxxxxxxxx> wrote:
On 01/18/2018 02:36 PM, Paul Moore wrote:
On Thu, Jan 18, 2018 at 4:58 PM, Mark Salyzyn <salyzyn@xxxxxxxxxxx> wrote:
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 14233 Comm: syz-executor2 Not tainted 4.4.112-g5f6325b #28
. . .
[<ffffffff81b6a19d>] selinux_socket_setsockopt+0x4d/0x80
security/selinux/hooks.c:4338
[<ffffffff81b4873d>] security_socket_setsockopt+0x7d/0xb0
security/security.c:1257
[<ffffffff82df1ac8>] SYSC_setsockopt net/socket.c:1757 [inline]
[<ffffffff82df1ac8>] SyS_setsockopt+0xe8/0x250 net/socket.c:1746
[<ffffffff83776499>] entry_SYSCALL_64_fastpath+0x16/0x92
Code: c2 42 9b b6 81 be 01 00 00 00 48 c7 c7 a0 cb 2b 84 e8
f7 2f 6d ff 49 8d 7d 10 48 b8 00 00 00 00 00 fc ff df 48 89
fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 83 01 00
00 41 8b 75 10 31
RIP [<ffffffff81b69b7e>] sock_has_perm+0x1fe/0x3e0
security/selinux/hooks.c:4069
RSP <ffff8800b5957ce0>
---[ end trace 7b5aaf788fef6174 ]---

In the absence of commit a4298e4522d6 ("net: add SOCK_RCU_FREE socket
flag") and all the associated infrastructure changes to take advantage
of a RCU grace period before freeing, there is a heightened
possibility that a security check is performed while an ill-timed
setsockopt call races in from user space. It then is prudent to null
check sk_security, and if the case, reject the permissions.

This adjustment is orthogonal to infrastructure improvements that may
nullify the needed check, but should be added as good code hygiene.
I'm skeptical that this is the full solution for systems that lack the
SOCK_RCU_FREE protection. Is this really limited to just
setsockopt()?
Maybe overstepped in my analysis and assumptions?

This is a result of a fuzzer hitting an android 4.4 KASAN kernel. We (so
far) have _not_ seen this with an android 4.9 KASAN kernel (which has the
SOCK_RCU_FREE adjustments). There is no standalone duplication or PoC
_except_ via the fuzzer. The rest of the statements stands based on this
tidbit (statements of general good code hygiene, not 100% sure SOCK_RCU_FREE
usage is completely covered, KISS solution etc).

To be honest, yes, this may be a layer in the onion (swat this NULL check
does not by itself solve the _problem_), I'd prefer kernel continuing on in
a rational manner rather than panic ... and I have a gut feeling this could
be a gratuitous NULL check if all the bugs in the network layer have been
solved <that may be sarcasm, I can not tell>. Programming to solve a problem
with one's gut is not a good practice, but hygiene is. This is 10
characters, and an estimated 1.2ns of added hygiene.

No, I do not think this is limited to setsockopt, but would be willing to
believe a multithreaded attack of any socket functions or ioctl would drop
down to the check with sock_has_perm at possibly the wrong time in socket
teardown.
I'm not necessarily opposed to adding additional safety checks, if
warranted, but I am opposed to adding a single check and declaring
mission accomplished when there is a suspicion that additional checks
may be needed.

Perhaps in this particular case it really is only setsockopt(), but
from what I can tell from your comments and the SOCK_RCU_FREE commit
message it would appear that there is a race condition here between a
socket's lifetime and its visibility to userspace.

I will need to fix my comments to be clearer ... (besides, I got the error return wrong, so I will have to respin it anyways).

In later kernels SOCK_RCU_FREE _appears_ to fix the race condition. In earlier kernels there is _no_ SOCK_RCU_FREE infrastructure, and thus the race condition exists. That race conditions _should_ have been solved in ToT. I have evaluated that porting all the SOCK_RCU_FREE refactoring may be too risky an endeavor for stable kernels though (maybe I am wrong about that? The author of those fixes edumazet@xxxxxxxxxx has been notified, have yet to receive a response).

Because I can not confirm that SOCK_RCU_FREE solves the problem, I am making a case for this to land in 4.9+ to ToT kernels as good hygiene. But maybe this should only go in 4.4- stable ...

-- Mark