Re: [PATCH -next] slub: Replace __this_cpu_inc usage w/ SLUB_STATS

From: Johannes Hirte
Date: Mon Apr 14 2014 - 09:42:22 EST


On Thu, 6 Mar 2014 12:29:41 -0600
Josh Cartwright <joshc@xxxxxxxxxxxxxx> wrote:

> On Thu, Mar 06, 2014 at 09:53:16AM -0600, Josh Cartwright wrote:
> > Booting on my Samsung Series 9 laptop gives me loads and loads of
> > BUGs triggered by __this_cpu_add(), making the system
> > completely unusable:
> >
> > [ 5.808326] BUG: using __this_cpu_add() in preemptible [00000000] code: swapper/0/1
> > [ 5.812331] caller is __this_cpu_preempt_check+0x2b/0x30
> > [ 5.815654] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc5-next-20140306-joshc-08290-g0ffb2fe #1
> > [ 5.819553] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 900X3C/900X3D/900X3E/900X4C/900X4D/NP900X3E-A02US, BIOS P07ABK 04/09/2013
> > [ 5.823558] ffff8801182157c0 ffff880118215790 ffffffff81a64cec 0000000000000000
> > [ 5.827177] ffff8801182157b0 ffffffff81462360 ffff8800c3d553e0 ffffea00030f5500
> > [ 5.830744] ffff8801182157e8 ffffffff814623bb 635f736968745f5f 29286464615f7570
> > [ 5.834134] Call Trace:
> > [ 5.836848] [<ffffffff81a64cec>] dump_stack+0x4e/0x7a
> > [ 5.839943] [<ffffffff81462360>] check_preemption_disabled+0xd0/0xe0
> > [ 5.842997] [<ffffffff814623bb>] __this_cpu_preempt_check+0x2b/0x30
> > [ 5.846022] [<ffffffff81a6331d>] __slab_free+0x38/0x590
> > [ 5.848863] [<ffffffff811759dd>] ? get_parent_ip+0xd/0x50
> > [ 5.850467] BUG: using __this_cpu_add() in preemptible [00000000] code: khubd/36
> > [ 5.850472] caller is __this_cpu_preempt_check+0x2b/0x30
> > [ 5.859125] [<ffffffff81175b3b>] ? preempt_count_sub+0x6b/0xf0
> > [ 5.862521] [<ffffffff81a7175a>] ? _raw_spin_unlock_irqrestore+0x4a/0x80
> > [ 5.865599] [<ffffffff81462e5e>] ? __debug_check_no_obj_freed+0x13e/0x240
> > [ 5.868738] [<ffffffff814623bb>] ? __this_cpu_preempt_check+0x2b/0x30
> > [ 5.871799] [<ffffffff81287327>] kfree+0x2f7/0x300
>
> FWIW, it looks like the magic combination of options is:
> - CONFIG_DEBUG_PREEMPT=y
> - CONFIG_SLUB=y
> - CONFIG_SLUB_STATS=y
>
> Looks like the new percpu() checks are complaining about SLUB's use of
> __this_cpu_inc() for maintaining its stat counters. The below patch
> seems to fix it.
>
> Although, I'm wondering how exact these statistics need to be. Is
> making them preemption safe even a concern?
>

Looks like there is a similar issue in touch_softlockup_watchdog too:

Apr 14 14:56:01 localhost kernel: BUG: using __this_cpu_write() in preemptible [00000000] code: systemd-udevd/1307
Apr 14 14:56:01 localhost kernel: caller is touch_softlockup_watchdog+0x11/0x1f
Apr 14 14:56:01 localhost kernel: CPU: 0 PID: 1307 Comm: systemd-udevd Tainted: G W 3.15.0-rc1 #44
Apr 14 14:56:01 localhost kernel: Hardware name: Hewlett-Packard HP ProBook 6450b/146D, BIOS 68CDE Ver. F.23 06/13/2012
Apr 14 14:56:01 localhost kernel: 0000000000000000 ffffffff815b6385 0000000000000000 ffffffff813005a4
Apr 14 14:56:01 localhost kernel: 0000000000000000 0000000000000032 00000000000003e8 ffffffff810c63bc
Apr 14 14:56:01 localhost kernel: ffffffff81332592 ffff8800b4ea8800 0000000000000000 ffff8800b686e030
Apr 14 14:56:01 localhost kernel: Call Trace:
Apr 14 14:56:01 localhost kernel: [<ffffffff815b6385>] ? dump_stack+0x4a/0x75
Apr 14 14:56:01 localhost kernel: [<ffffffff813005a4>] ? check_preemption_disabled+0xd6/0xe5
Apr 14 14:56:01 localhost kernel: [<ffffffff810c63bc>] ? touch_softlockup_watchdog+0x11/0x1f
Apr 14 14:56:01 localhost kernel: [<ffffffff81332592>] ? acpi_os_stall+0x2f/0x36
Apr 14 14:56:01 localhost kernel: [<ffffffff8134b64a>] ? acpi_ex_system_do_stall+0x34/0x37
Apr 14 14:56:01 localhost kernel: [<ffffffff813411d4>] ? acpi_ds_exec_end_op+0xcc/0x3d5
Apr 14 14:56:01 localhost kernel: [<ffffffff81351fcf>] ? acpi_ps_parse_loop+0x50c/0x564
Apr 14 14:56:01 localhost kernel: [<ffffffff81352a21>] ? acpi_ps_parse_aml+0x93/0x26f
Apr 14 14:56:01 localhost kernel: [<ffffffff813531eb>] ? acpi_ps_execute_method+0x1b6/0x25f
Apr 14 14:56:01 localhost kernel: [<ffffffff8134debe>] ? acpi_ns_evaluate+0x1ba/0x247
Apr 14 14:56:01 localhost kernel: [<ffffffff81350557>] ? acpi_evaluate_object+0x122/0x231
Apr 14 14:56:01 localhost kernel: [<ffffffffa005a230>] ? lis3lv02d_acpi_init+0x1c/0x27 [hp_accel]
Apr 14 14:56:01 localhost kernel: [<ffffffffa005320a>] ? lis3lv02d_poweron+0xe/0xca [lis3lv02d]
Apr 14 14:56:01 localhost kernel: [<ffffffffa0053b16>] ? lis3lv02d_init_device+0x22a/0x4e5 [lis3lv02d]
Apr 14 14:56:01 localhost kernel: [<ffffffffa005a347>] ? lis3lv02d_add+0x10c/0x18a [hp_accel]
Apr 14 14:56:01 localhost kernel: [<ffffffff81335d82>] ? acpi_device_probe+0x3d/0xeb
Apr 14 14:56:01 localhost kernel: [<ffffffff81418e8b>] ? driver_probe_device+0x97/0x1b8
Apr 14 14:56:01 localhost kernel: [<ffffffff8141903a>] ? __driver_attach+0x58/0x78
Apr 14 14:56:01 localhost kernel: [<ffffffff81418fe2>] ? __device_attach+0x36/0x36
Apr 14 14:56:01 localhost kernel: [<ffffffff81417650>] ? bus_for_each_dev+0x73/0x7d
Apr 14 14:56:01 localhost kernel: [<ffffffff814186f4>] ? bus_add_driver+0x105/0x1ce
Apr 14 14:56:01 localhost kernel: [<ffffffff81419577>] ? driver_register+0x88/0xc0
Apr 14 14:56:01 localhost kernel: [<ffffffffa005f000>] ? 0xffffffffa005efff
Apr 14 14:56:01 localhost kernel: [<ffffffff8100029e>] ? do_one_initcall+0x7d/0x101
Apr 14 14:56:01 localhost kernel: [<ffffffff815be854>] ? notifier_call_chain+0x37/0x57
Apr 14 14:56:01 localhost kernel: [<ffffffff81076cd2>] ? __blocking_notifier_call_chain+0x53/0x60
Apr 14 14:56:01 localhost kernel: [<ffffffff810b0740>] ? load_module+0x19f6/0x1ba7
Apr 14 14:56:01 localhost kernel: [<ffffffff810ad754>] ? module_flags+0x74/0x74
Apr 14 14:56:01 localhost kernel: [<ffffffff810b09de>] ? SyS_finit_module+0x4f/0x63
Apr 14 14:56:01 localhost kernel: [<ffffffff815c199f>] ? tracesys+0xdd/0xe2

kernel/watchdog.c:

void touch_softlockup_watchdog(void)
{
	__this_cpu_write(watchdog_touch_ts, 0);
}
EXPORT_SYMBOL(touch_softlockup_watchdog);

I don't know whether switching this to this_cpu_write() is the right fix
here too.
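
To make the distinction concrete, here is a rough userspace model of why
the check fires for __this_cpu_write() and stays quiet for
this_cpu_write(). This is only a sketch and not kernel code: every
model_* name is hypothetical, the preempt count is a plain integer, and
the real check_preemption_disabled() does more than this.

```c
/* Userspace model only; all model_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stdio.h>

#define MODEL_NR_CPUS 2

static int model_preempt_count;		/* > 0 means preemption is disabled */
static int model_current_cpu;		/* CPU the modelled task runs on */
static unsigned long model_touch_ts[MODEL_NR_CPUS];
static int model_bug_reports;		/* number of times the check fired */

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Stand-in for the debug check: complain when called while preemptible. */
static void model_preempt_check(const char *op)
{
	if (model_preempt_count == 0) {
		model_bug_reports++;
		fprintf(stderr, "BUG: using %s in preemptible code\n", op);
	}
}

/* Models __this_cpu_write(): the caller is expected to have disabled preemption. */
static void model_raw_this_cpu_write(unsigned long val)
{
	model_preempt_check("__this_cpu_write()");
	model_touch_ts[model_current_cpu] = val;
}

/* Models this_cpu_write(): usable from preemptible context, so no check. */
static void model_this_cpu_write(unsigned long val)
{
	model_touch_ts[model_current_cpu] = val;
}
```

In this model, calling model_raw_this_cpu_write(0) with the count at zero
reports a BUG, while wrapping it in model_preempt_disable() and
model_preempt_enable(), or calling model_this_cpu_write(0) instead, does
not. Which of the two is appropriate for the watchdog is the open
question.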

regards,
Johannes