Re: divide error: 0000 [#1] SMP in task_numa_migrate - handle_mm_fault vanilla 4.4.6
From: Yannis Aribaud
Date: Tue Jun 21 2016 - 08:13:30 EST
Hi everyone,
I recently hit this bug in the kernel using a vanilla 4.6.2 release.
It seems that somewhere in the load average calculation a division by 0 occurs (see the stack trace
at the end).
After digging a bit in the kernel sources (to be fair, it's my first time), I found that we "recently"
added the function cfs_rq_load_avg (commit 6f2b04524f0b38bfbb8413f98d2d6af234508309) and started
using it in the function task_h_load, which divides by the value it returns
(kernel/sched/fair.c), like this:
static unsigned long task_h_load(struct task_struct *p)
{
	struct cfs_rq *cfs_rq = task_cfs_rq(p);

	update_cfs_rq_h_load(cfs_rq);
	return div64_ul(p->se.avg.load_avg * cfs_rq->h_load,
			cfs_rq_load_avg(cfs_rq) + 1);
}
But the load_avg field of the sched_avg struct is an atomic_long_t, and cfs_rq_load_avg returns
this field as an unsigned long without doing any type conversion:
static inline unsigned long cfs_rq_load_avg(struct cfs_rq *cfs_rq)
{
	return cfs_rq->avg.load_avg;
}
I'm not an expert at all, but I suspect that this is the origin of the issue. Shouldn't the function
cfs_rq_load_avg use an atomic_long_read() to avoid this?
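For reference, the "+ 1" in the divisor only protects against a load_avg of 0. In the trace
below, RCX is 0 at the faulting instruction, and the Code: bytes show "inc %rcx" (48 ff c1)
right before "div %rcx" (48 f7 f1), which appears consistent with the value read being all ones
(ULONG_MAX), so that adding 1 wrapped it to 0. Here is a minimal userspace sketch of that
wraparound, not kernel code; the helper names h_load_divisor and divisor_would_trap are
hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace model of the divisor used in task_h_load():
 *     cfs_rq_load_avg(cfs_rq) + 1
 * Unsigned arithmetic wraps around, so ULONG_MAX + 1 == 0.
 * (Hypothetical helper, not a kernel function.)
 */
unsigned long h_load_divisor(unsigned long load_avg)
{
	return load_avg + 1;
}

/* True when dividing by the value above would raise a divide error. */
bool divisor_would_trap(unsigned long load_avg)
{
	return h_load_divisor(load_avg) == 0;
}

/* Small self-check that can be called from any main(). */
void check_wraparound(void)
{
	assert(h_load_divisor(0) == 1);         /* the case "+ 1" guards against */
	assert(!divisor_would_trap(215));       /* ordinary load: safe */
	assert(divisor_would_trap(ULONG_MAX));  /* all ones: divisor becomes 0 */
}
```

This only shows how the divisor can become 0; it says nothing about how load_avg could end up
at ULONG_MAX in the first place.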
Here is the stack trace:
[534814.112500] divide error: 0000 [#1] SMP
[534814.112550] Modules linked in: vhost_net vhost macvtap macvlan ipmi_si mpt3sas raid_class
scsi_transport_sas ipmi_devintf dell_rbu tun nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace
fscache sunrpc bridge 8021q garp mrp stp llc bonding xfs libcrc32c bcache usbhid hid uhci_hcd
ohci_hcd x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass ghash_clmulni_intel iTCO_wdt
iTCO_vendor_support sha256_generic hmac drbg dcdbas ansi_cprng aesni_intel aes_x86_64 ablk_helper
cryptd lrw gf128mul glue_helper shpchp evdev sb_edac edac_core ehci_pci ehci_hcd lpc_ich usbcore
mfd_core usb_common ipmi_msghandler acpi_cpufreq wmi tpm_tis tpm processor acpi_power_meter button
ext4 crc16 jbd2 mbcache sg sd_mod dm_mod crc32c_intel igb megaraid_sas i2c_algo_bit i2c_core dca
ptp scsi_mod pps_core [last unloaded: ipmi_si]
[534814.113345] CPU: 10 PID: 38568 Comm: ceph-osd Not tainted 4.6.2-ig1virt #16
[534814.113390] Hardware name: Dell Inc. PowerEdge R730xd/0H21J3, BIOS 1.1.4 11/03/2014
[534814.113458] task: ffff88100cf5ef00 ti: ffff8814827e0000 task.ti: ffff8814827e0000
[534814.113525] RIP: 0010:[<ffffffff8106cfd7>] [<ffffffff8106cfd7>] task_h_load+0x4f/0xc7
[534814.113613] RSP: 0000:ffff8814827e3c00 EFLAGS: 00010256
[534814.113654] RAX: 0000000000000000 RBX: 00000000000000d7 RCX: 0000000000000000
[534814.113720] RDX: 0000000000000000 RSI: ffff88103d8a5f00 RDI: ffff88100cf5ef00
[534814.113786] RBP: ffff8814827e3c90 R08: 0000000107f70c76 R09: 0000000000000000
[534814.113851] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
[534814.113917] R13: 0000000000000015 R14: 0000000000000000 R15: ffff88207ec14580
[534814.113984] FS: 00007eff83cbb700(0000) GS:ffff88107f4a0000(0000) knlGS:0000000000000000
[534814.114053] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[534814.114095] CR2: 00000000146b07c0 CR3: 000000103f605000 CR4: 00000000001426e0
[534814.119456] Stack:
[534814.119488] ffffffff8106fc4c ffff88103d639400 00000000000000d7 0000000000014580
[534814.119571] fffffffffffffe19 ffff88107f4a0000 00000000000000d7 0000000000000027
[534814.119653] ffff88100cf5ef00 000000000000025f 0000000000000100 0000000000000188
[534814.119736] Call Trace:
[534814.119774] [<ffffffff8106fc4c>] ? task_numa_find_cpu+0x1d2/0x2ec
[534814.119819] [<ffffffff8106fe86>] ? task_numa_migrate+0x120/0x328
[534814.119864] [<ffffffff81067829>] ? ttwu_do_wakeup+0xf/0xcd
[534814.119907] [<ffffffff81071176>] ? task_numa_fault+0x912/0x9a9
[534814.119954] [<ffffffff81128568>] ? mpol_misplaced+0x138/0x14a
[534814.120001] [<ffffffff8110f39d>] ? handle_mm_fault+0xe28/0xf31
[534814.120046] [<ffffffff8113db0b>] ? fput+0xd/0x81
[534814.120087] [<ffffffff8103cd91>] ? __do_page_fault+0x425/0x485
[534814.120131] [<ffffffff813c85a2>] ? page_fault+0x22/0x30
[534814.120171] Code: 63 92 38 09 00 00 48 8b 80 b8 00 00 00 48 8b 04 d0 75 1c 48 8b 86 b0 00 00 00
48 8b 4e 78 31 d2 48 0f af 87 58 01 00 00 48 ff c1 <48> f7 f1 c3 48 c7 86 c0 00 00 00 00 00 00 00
48 89 f1 eb 18 48
[534814.120582] RIP [<ffffffff8106cfd7>] task_h_load+0x4f/0xc7
[534814.120628] RSP <ffff8814827e3c00>
[534814.121242] ---[ end trace ca72a3c25fb6f0dc ]---
Best regards,
--
Yannis Aribaud