[PATCHv2] lockdep: avoid signed overflow

From: Mark Rutland
Date: Thu Mar 16 2017 - 08:25:30 EST


The counters in struct lockdep_stats are all (signed) ints. For some
counters (e.g. hardirqs_on_events, hardirqs_off_events), it's easy to
trigger an overflow in a short period of time, rendering the information
exposed under /proc/lockdep_stats erroneous, and causing UBSAN to
scream.

Since all counters can never validly be negative, we can make them
unsigned, which is sufficient to placate UBSAN. However, this merely
doubles the time until an overflow silently occurs, rendering the
information under /proc/lockdep_stats erroneous.

To avoid the undefined behaviour and to avoid erroneous results, this
patch upgrades all of the counters to unsigned long long, which cannot
overflow in a reasonable time frame.

We don't need to change anything else, since the counters are only
manipulated via this_cpu inc/dec operations, which are unaffected by the
type. The counters are only read when generating /proc/lockdep_stats,
where they are summed into unsigned long longs.

Common code already uses 64-bit values with this_cpu operations, so this
shouldn't be problematic for 32-bit parts.

Original UBSAN splats:

[137987.143729] ================================================================================
[137987.152266] UBSAN: Undefined behaviour in kernel/locking/lockdep.c:2696:2
[137987.159140] signed integer overflow:
[137987.162801] 2147483647 + 1 cannot be represented in type 'int'
[137987.168723] CPU: 0 PID: 24149 Comm: qemu-system-aar Not tainted 4.11.0-rc1-00002-g23187dc #12
[137987.177329] Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
[137987.184980] Call trace:
[137987.187523] [<ffff200008094638>] dump_backtrace+0x0/0x538
[137987.193010] [<ffff200008094b90>] show_stack+0x20/0x30
[137987.198150] [<ffff200008bd4c00>] dump_stack+0x120/0x188
[137987.203463] [<ffff200008c7c588>] ubsan_epilogue+0x18/0x98
[137987.208948] [<ffff200008c7da24>] handle_overflow+0x1f0/0x234
[137987.214695] [<ffff200008c7da9c>] __ubsan_handle_add_overflow+0x34/0x44
[137987.221311] [<ffff200008265510>] trace_hardirqs_on_caller+0x728/0x768
[137987.227838] [<ffff200008265560>] trace_hardirqs_on+0x10/0x18
[137987.233585] [<ffff200009fc1b7c>] yield_to+0x3c4/0x700
[137987.238725] [<ffff2000080db340>] kvm_vcpu_yield_to+0x138/0x2f8
[137987.244646] [<ffff2000080db948>] kvm_vcpu_on_spin+0x448/0x718
[137987.250480] [<ffff2000080fed20>] kvm_handle_wfx+0x80/0x228
[137987.256053] [<ffff2000080ff150>] handle_exit+0x258/0x5b0
[137987.261453] [<ffff2000080f11e4>] kvm_arch_vcpu_ioctl_run+0x7ac/0x1170
[137987.267980] [<ffff2000080d85ac>] kvm_vcpu_ioctl+0x6bc/0xe30
[137987.273641] [<ffff2000085fb774>] do_vfs_ioctl+0x194/0x14a0
[137987.279213] [<ffff2000085fcb28>] SyS_ioctl+0xa8/0xb8
[137987.284265] [<ffff200008084770>] el0_svc_naked+0x24/0x28
[137987.289659] ================================================================================
[137987.298322] ================================================================================
[137987.306912] UBSAN: Undefined behaviour in kernel/locking/lockdep.c:2775:3
[137987.313791] signed integer overflow:
[137987.317457] 2147483647 + 1 cannot be represented in type 'int'
[137987.323384] CPU: 0 PID: 24149 Comm: qemu-system-aar Not tainted 4.11.0-rc1-00002-g23187dc #12
[137987.331997] Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
[137987.339654] Call trace:
[137987.342197] [<ffff200008094638>] dump_backtrace+0x0/0x538
[137987.347691] [<ffff200008094b90>] show_stack+0x20/0x30
[137987.352836] [<ffff200008bd4c00>] dump_stack+0x120/0x188
[137987.358155] [<ffff200008c7c588>] ubsan_epilogue+0x18/0x98
[137987.363648] [<ffff200008c7da24>] handle_overflow+0x1f0/0x234
[137987.369401] [<ffff200008c7da9c>] __ubsan_handle_add_overflow+0x34/0x44
[137987.376023] [<ffff20000825c8a0>] trace_hardirqs_off_caller+0x2b8/0x420
[137987.382645] [<ffff20000825ca18>] trace_hardirqs_off+0x10/0x18
[137987.388484] [<ffff200008083fb0>] el1_irq+0x70/0x130
[137987.393457] [<ffff2000080db340>] kvm_vcpu_yield_to+0x138/0x2f8
[137987.399384] [<ffff2000080db948>] kvm_vcpu_on_spin+0x448/0x718
[137987.405225] [<ffff2000080fed20>] kvm_handle_wfx+0x80/0x228
[137987.410804] [<ffff2000080ff150>] handle_exit+0x258/0x5b0
[137987.416211] [<ffff2000080f11e4>] kvm_arch_vcpu_ioctl_run+0x7ac/0x1170
[137987.422746] [<ffff2000080d85ac>] kvm_vcpu_ioctl+0x6bc/0xe30
[137987.428412] [<ffff2000085fb774>] do_vfs_ioctl+0x194/0x14a0
[137987.433991] [<ffff2000085fcb28>] SyS_ioctl+0xa8/0xb8
[137987.439049] [<ffff200008084770>] el0_svc_naked+0x24/0x28
[137987.444451] ================================================================================

Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
---
kernel/locking/lockdep_internals.h | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)

Since v1 [1]:
* Use unsigned long long to ensure counters are wide enough on 32-bit parts

Mark.

[1] https://lkml.kernel.org/r/1489595492-11745-1-git-send-email-mark.rutland@xxxxxxx

diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index c2b8849..6e019f7 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -132,23 +132,23 @@ extern void get_usage_chars(struct lock_class *class,
* and we want to avoid too much cache bouncing.
*/
struct lockdep_stats {
- int chain_lookup_hits;
- int chain_lookup_misses;
- int hardirqs_on_events;
- int hardirqs_off_events;
- int redundant_hardirqs_on;
- int redundant_hardirqs_off;
- int softirqs_on_events;
- int softirqs_off_events;
- int redundant_softirqs_on;
- int redundant_softirqs_off;
- int nr_unused_locks;
- int nr_cyclic_checks;
- int nr_cyclic_check_recursions;
- int nr_find_usage_forwards_checks;
- int nr_find_usage_forwards_recursions;
- int nr_find_usage_backwards_checks;
- int nr_find_usage_backwards_recursions;
+ unsigned long long chain_lookup_hits;
+ unsigned long long chain_lookup_misses;
+ unsigned long long hardirqs_on_events;
+ unsigned long long hardirqs_off_events;
+ unsigned long long redundant_hardirqs_on;
+ unsigned long long redundant_hardirqs_off;
+ unsigned long long softirqs_on_events;
+ unsigned long long softirqs_off_events;
+ unsigned long long redundant_softirqs_on;
+ unsigned long long redundant_softirqs_off;
+ unsigned long long nr_unused_locks;
+ unsigned long long nr_cyclic_checks;
+ unsigned long long nr_cyclic_check_recursions;
+ unsigned long long nr_find_usage_forwards_checks;
+ unsigned long long nr_find_usage_forwards_recursions;
+ unsigned long long nr_find_usage_backwards_checks;
+ unsigned long long nr_find_usage_backwards_recursions;
};

DECLARE_PER_CPU(struct lockdep_stats, lockdep_stats);
--
1.9.1