Re: [PATCH v3 1/3] ptrace: Provide ___ptrace_may_access() that can be applied on arbitrary tasks

From: Andrea Arcangeli
Date: Wed Sep 05 2018 - 14:40:24 EST


On Wed, Sep 05, 2018 at 08:29:07PM +0200, Jiri Kosina wrote:
> (and no, my testing of the patch I sent on current tree didn't produce any
> hangs -- was there a reliable way to trigger it on 3.10?).

Only a very specific libvirt acceptance test found this after a while,
and it wasn't a customer: it was caught by QA. The reporter said they
weren't sure how to reproduce this issue either, as it happened only
once in a while. The backtrace was still enough to fix it for sure,
and after the fix it never happened again.

It's not because of virt, but probably because of selinux+audit. This
is precisely why I thought that once you enter LSM from the scheduler
atomic path the trouble starts: each LSM implementation of those
calls may or may not crash.

Perhaps you didn't sandbox KVM inside selinux by default?

This is the lockup that the patch I posted fixed for 3.10.

[ 1838.621010] Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 6
[ 1838.629070] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 3.10.0-327.62.4.el7.x86_64 #1
[ 1838.637610] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.4.2 01/09/2017
[ 1838.645954] Call Trace:
[ 1838.648680] <NMI> [<ffffffff8163a05d>] dump_stack+0x19/0x1b
[ 1838.655113] [<ffffffff816338d0>] panic+0xd8/0x1e7
[ 1838.660460] [<ffffffff8111e960>] ? restart_watchdog_hrtimer+0x50/0x50
[ 1838.667742] [<ffffffff8111ea22>] watchdog_overflow_callback+0xc2/0xd0
[ 1838.675024] [<ffffffff81162211>] __perf_event_overflow+0xa1/0x250
[ 1838.681920] [<ffffffff81162ce4>] perf_event_overflow+0x14/0x20
[ 1838.688526] [<ffffffff810337c8>] intel_pmu_handle_irq+0x1e8/0x470
[ 1838.695423] [<ffffffff812f83cc>] ? ioremap_page_range+0x24c/0x330
[ 1838.702320] [<ffffffff811a9031>] ? unmap_kernel_range_noflush+0x11/0x20
[ 1838.709797] [<ffffffff813997f4>] ? ghes_copy_tofrom_phys+0x124/0x210
[ 1838.716984] [<ffffffff81399980>] ? ghes_read_estatus+0xa0/0x190
[ 1838.723687] [<ffffffff816444bb>] perf_event_nmi_handler+0x2b/0x50
[ 1838.730582] [<ffffffff81643c09>] nmi_handle.isra.0+0x69/0xb0
[ 1838.736992] [<ffffffff81643db9>] do_nmi+0x169/0x340
[ 1838.742532] [<ffffffff81642ff9>] end_repeat_nmi+0x1e/0x7e
[ 1838.748653] [<ffffffff81641bbd>] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.755742] [<ffffffff81641bbd>] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.762831] [<ffffffff81641bbd>] ? _raw_spin_lock_irqsave+0x3d/0x60
[ 1838.769917] <<EOE>> [<ffffffff816391e5>] avc_compute_av+0x126/0x1b5
[ 1838.777125] [<ffffffff810b842e>] ? walk_tg_tree_from+0xbe/0x110
[ 1838.783828] [<ffffffff8128b9c4>] avc_has_perm_noaudit+0xc4/0x110
[ 1838.790628] [<ffffffff8128f1fb>] cred_has_capability+0x6b/0x120
[ 1838.797331] [<ffffffff810db71c>] ? ktime_get+0x4c/0xd0
[ 1838.803160] [<ffffffff810e167b>] ? clockevents_program_event+0x6b/0xf0
[ 1838.810532] [<ffffffff8128f2de>] selinux_capable+0x2e/0x40
[ 1838.816748] [<ffffffff81288f65>] security_capable_noaudit+0x15/0x20
[ 1838.823829] [<ffffffff8108b975>] has_ns_capability_noaudit+0x15/0x20
[ 1838.831014] [<ffffffff8108bc55>] ptrace_has_cap+0x35/0x40
[ 1838.837126] [<ffffffff8108c717>] ___ptrace_may_access+0xa7/0x1e0
[ 1838.843925] [<ffffffff8163f0ae>] __schedule+0x26e/0xa00
[ 1838.849855] [<ffffffff81640949>] schedule_preempt_disabled+0x29/0x70
[ 1838.857041] [<ffffffff810d9324>] cpu_startup_entry+0x184/0x290
[ 1838.863637] [<ffffffff8104891a>] start_secondary+0x1da/0x250
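
As a rough userspace illustration of the hazard discussed above (an
LSM hook that takes a lock being reached from the scheduler atomic
path), here is a minimal C sketch. It is not kernel code, and every
name in it (lsm_lock, hook_lockless, hook_locking, call_from_atomic,
demo) is hypothetical. The trace does not show who holds the spinlock,
so the "lock already held" case is only an assumption about how such a
hang could arise. The sketch uses a try-lock so it can report the
problem instead of spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock private to one LSM implementation. */
static atomic_flag lsm_lock = ATOMIC_FLAG_INIT;

/* Try once instead of spinning, so the demo reports the hazard
 * rather than hanging the way a real spinlock would. */
static bool lsm_trylock(void)
{
    return !atomic_flag_test_and_set(&lsm_lock);
}

static void lsm_unlock(void)
{
    atomic_flag_clear(&lsm_lock);
}

/* Hypothetical hook type: returns true if the check could complete. */
typedef bool (*capable_hook_t)(void);

/* An implementation that needs no lock: usable from any context. */
static bool hook_lockless(void)
{
    return true;
}

/* An implementation that takes its own lock to do the check. */
static bool hook_locking(void)
{
    if (!lsm_trylock())
        return false;   /* a real spinlock would spin here forever */
    lsm_unlock();
    return true;
}

/* Models an atomic caller invoking whichever hook is registered.
 * lock_already_held simulates the (assumed) case where the lock was
 * taken before the atomic path was entered and cannot be released
 * until that path returns. */
static bool call_from_atomic(capable_hook_t hook, bool lock_already_held)
{
    bool ok;

    if (lock_already_held)
        (void)lsm_trylock();   /* lock is free on entry, so this takes it */
    ok = hook();
    if (lock_already_held)
        lsm_unlock();
    return ok;
}

/* Same caller, different hook implementations, different outcomes. */
static void demo(void)
{
    assert(call_from_atomic(hook_lockless, true));
    assert(call_from_atomic(hook_locking, false));
    assert(!call_from_atomic(hook_locking, true));
}
```

The point of the sketch is only that the caller cannot tell from the
hook's signature whether a given implementation is safe to call from
an atomic path; that depends on what each implementation does
internally.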