x86: disabled interrupts on shutdown causing tlb flush WARNs
From: Don Zickus
Date: Wed Jan 29 2014 - 13:55:37 EST
Hi Ingo,
A while ago patch 55c844a4dd16a4d1fdc0cf2a283ec631a02ec448 was committed
to deal with WARN_ONs during shutdown in native_smp_send_reschedule() (in
arch/x86/kernel/smp.c).
The solution at the time was to disable local interrupts to block the
timer interrupt from calling the reschedule function during
shutdown/reboot.
Recently, a customer reported that this patch causes a new WARN_ON in
smp_call_function_many() (in kernel/smp.c) because interrupts are disabled.
It seems to be related to iounmap() calling flush_tlb_kernel_range().
The stack trace is:
[ 3255.956295] WARNING: at kernel/smp.c:387 smp_call_function_many+0xaf/0x2c0()
[ 3255.956295] Modules linked in: nf_conntrack_netbios_ns
nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6table_security
ip6table_raw ip6t_REJECT iptable_nat nf_nat_ipv4 iptable_mangle
iptable_security iptable_raw ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack ebtable_filter ebtables ip6table_filter sg iptable_filter
ip_tables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_nat
nf_conntrack ip6_tables coretemp kvm_intel kvm crc32_pclmul crc32c_intel
ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper
cryptd microcode pcspkr be2net iTCO_wdt iTCO_vendor_support ixgbe ses ptp
enclosure pps_core hpilo hpwdt mdio lpc_ich mfd_core ioatdma shpchp dca
vfat fat acpi_cpufreq mperf dm_service_time sd_mod lpfc mgag200
syscopyarea qla2xxx crc_t10dif sysfillrect sysimgblt i2c_algo_bit
drm_kms_helper scsi_transport_fc ttm scsi_tgt drm i2c_core dm_multipath
dm_mirror dm_region_hash dm_log dm_mod
[ 3255.956295] CPU: 0 PID: 19237 Comm: reboot Not tainted 3.10.0-33.el7.x86_64 #1
[ 3255.956295] 0000000000000009 ffff901f4e42bb50 ffffffff815fdacb ffff901f4e42bb88
[ 3255.956295] ffffffff81058c61 0000000000000000 ffffffff81a08c20 0000000000000000
[ 3255.956295] 0000000000000000 ffffffff81a08c20 ffff901f4e42bb98 ffffffff81058d3a
[ 3255.956295] Call Trace:
[ 3255.956295] [<ffffffff815fdacb>] dump_stack+0x19/0x1b
[ 3255.956295] [<ffffffff81058c61>] warn_slowpath_common+0x61/0x80
[ 3255.956295] [<ffffffff81058d3a>] warn_slowpath_null+0x1a/0x20
[ 3255.956295] [<ffffffff810b843f>] smp_call_function_many+0xaf/0x2c0
[ 3255.956295] [<ffffffff811647ee>] ? __insert_vmap_area+0x8e/0xc0
[ 3255.956295] [<ffffffff8104c2a0>] ? flush_tlb_func+0xb0/0xb0
[ 3255.956295] [<ffffffff8104c2a0>] ? flush_tlb_func+0xb0/0xb0
[ 3255.956295] [<ffffffff810b86ad>] on_each_cpu+0x2d/0x60
[ 3255.956295] [<ffffffff8104c73a>] flush_tlb_kernel_range+0x4a/0x70
[ 3255.956295] [<ffffffff8116524c>] __purge_vmap_area_lazy+0x16c/0x1d0
[ 3255.956295] [<ffffffff8116549e>] free_vmap_area_noflush+0x5e/0x60
[ 3255.956295] [<ffffffff81166d5e>] remove_vm_area+0x5e/0x70
[ 3255.956295] [<ffffffff810478c7>] iounmap+0x67/0xa0
[ 3255.956295] [<ffffffff8134e3d6>] acpi_os_write_memory+0x89/0x9d
[ 3255.956295] [<ffffffff81368887>] acpi_hw_write+0x3d/0x4e
[ 3255.956295] [<ffffffff813691c6>] acpi_reset+0x4f/0x51
[ 3255.956295] [<ffffffff8134ed40>] acpi_reboot+0xb0/0xb8
[ 3255.956295] [<ffffffff81035cc6>] native_machine_emergency_restart+0x186/0x240
[ 3255.956295] [<ffffffff810383a2>] ? disconnect_bsp_APIC+0x82/0xc0
[ 3255.956295] [<ffffffff810357c7>] native_machine_restart+0x37/0x40
[ 3255.956295] [<ffffffff81035a3f>] machine_restart+0xf/0x20
[ 3255.956295] [<ffffffff81071375>] kernel_restart+0x45/0x60
[ 3255.956295] [<ffffffff810715b9>] SYSC_reboot+0x229/0x260
[ 3255.956295] [<ffffffff8119a3a6>] ? do_readv_writev+0x176/0x240
[ 3255.956295] [<ffffffff8119af93>] ? __fput+0x183/0x270
[ 3255.956295] [<ffffffff8119b1ae>] ? ____fput+0xe/0x10
[ 3255.956295] [<ffffffff8107161e>] SyS_reboot+0xe/0x10
[ 3255.956295] [<ffffffff8160cf99>] system_call_fastpath+0x16/0x1b
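For reference, the condition behind this warning, as I read kernel/smp.c, is
roughly cpu_online(this_cpu) && irqs_disabled() && !oops_in_progress &&
!early_boot_irqs_disabled. Below is a minimal userspace sketch of that
condition, not kernel code. struct cpu_model and all of the helper names are
hypothetical and exist only for illustration.

```c
/* Userspace model only: none of this is kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the real check consults. */
struct cpu_model {
	bool online;                   /* models cpu_online(this_cpu) */
	bool irqs_disabled;            /* models irqs_disabled() */
	bool oops_in_progress;         /* models the global of the same name */
	bool early_boot_irqs_disabled; /* models the global of the same name */
};

/* Returns true when the modelled warning condition would fire. */
static inline bool cross_call_would_warn(const struct cpu_model *c)
{
	return c->online && c->irqs_disabled &&
	       !c->oops_in_progress && !c->early_boot_irqs_disabled;
}

/* Models the rebooting CPU once local interrupts have been disabled. */
static inline struct cpu_model reboot_cpu_after_irq_disable(void)
{
	struct cpu_model c = { .online = true, .irqs_disabled = true };
	return c;
}

/* Small self-check of the model: the warning depends on the irq state. */
static inline void model_self_check(void)
{
	struct cpu_model c = reboot_cpu_after_irq_disable();

	assert(cross_call_would_warn(&c));   /* irqs off on an online CPU */
	c.irqs_disabled = false;
	assert(!cross_call_would_warn(&c));  /* irqs on: condition is false */
}
```

In this model the reboot path above corresponds to an online CPU with
interrupts disabled, which is the case where the condition is true.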
I was wondering whether, in the above-referenced commit
(55c844a4dd16a4d1fdc0cf2), changing the 'local_irq_disable()' call to
'preempt_disable()' would address both the previous problem and the
current one.
I think my real question is, does preempt_disable() block
smp_send_reschedule() from executing? If not, then I need to find a
different solution.
Cheers,
Don