Re: 1e918876 breaks r8169 (linux-3.18+)

From: Daniel Exner
Date: Thu May 07 2015 - 15:23:16 EST


Hi,

First of all, please keep me in CC, as I am currently not subscribed to LKML.

> From: Tomas Szepe <szepe@xxxxxxxxxxxxxxx>
> Date: Sun, 22 Feb 2015 01:41:51 +0100
>
>>>> Sure, just did. Unfortunately, 3.19.0 + 0bec3b70 + this patch results
>>>> in a driver that retains the problem.
>>>
>>> OK, could you test following patch instead ?
>>
>> Yup, but tough luck: 3.19.0 + 0bec3b70 + this patch -> problem present.
>
> I'm reverting the two commits for now, as below.
>
> We can put them back in if we can resolve the problems.
>
> ====================
> [PATCH] r8169: Revert BQL and xmit_more support.

I have been observing exactly the same problems since the kernel 4.1 merge
window, _although_ this patch is in the tree.
So it must be something else.

Stacktrace 1 (still with rc1):
> WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:303
> dev_watchdog+0x23f/0x250()
> NETDEV WATCHDOG: em1 (r8169): transmit queue 0 timed out
> Modules linked in: af_packet rfcomm uvcvideo videobuf2_vmalloc
> videobuf2_memops videobuf2_core v4l2_common videodev media bnep raid456
> nf_conntrack_ipv4 async_raid6_recov nf_defrag_ipv4 async_memcpy
> snd_usb_audio snd_usbmidi_lib async_pq async_xor xt_tcpudp xt_limit xor
> async_tx xt_conntrack nf_conntrack xt_multiport btusb btbcm btintel
> bluetooth rfkill joydev iptable_filter evdev mac_hid ip_tables x_tables msr
> raid6_pq w83627ehf md_mod hwmon_vid amdkfd amd_iommu_v2 kvm_amd kvm
> snd_seq_dummy radeon snd_seq_oss microcode snd_emu10k1_synth snd_emux_synth
> psmouse snd_seq_midi_emul snd_seq_virmidi pcspkr serio_raw ttm snd_emu10k1
> edac_core drm_kms_helper edac_mce_amd k10temp snd_hda_codec_realtek
> snd_util_mem snd_hda_codec_generic snd_hda_codec_hdmi drm snd_ac97_codec
> snd_hda_intel snd_hda_controller agpgart emu10k1_gp ac97_bus gameport
> i2c_algo_bit snd_hda_codec nuvoton_cir rc_core 8250_fintek snd_hda_core
> acpi_cpufreq sp5100_tco snd_hwdep snd_pcm i2c_piix4 button shpchp processor
> thermal_sys hwmon ppdev snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi
> snd_seq_device snd_timer snd soundcore sch_fq_codel fuse binfmt_misc lp
> parport_pc parport ext4 crc16 jbd2 mbcache hid_generic sr_mod cdrom sd_mod
> hid_microsoft usbhid hid ahci libahci r8169 mii firewire_ohci firewire_core
> crc_itu_t xhci_pci ehci_pci ohci_pci xhci_hcd ehci_hcd ohci_hcd libata
> usbcore scsi_mod usb_common sunrpc dm_mirror dm_region_hash dm_log dm_mod
> CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.1.0-rc1-36580-g385d5ed-dirty #28
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./970 Extreme3
> R2.0, BIOS P1.60 06/05/2014
> 0000000000000000 ffffffff815f162c ffffffff814b23ce ffff88023fd43e08
> ffffffff810526b7 0000000000000000 ffff8802362fc3a0 ffff8802362fc000
> 0000000000000005 0000000000000001 ffffffff81052735 ffffffff815f5700
> Call Trace:
> <IRQ> [<ffffffff814b23ce>] ? dump_stack+0x47/0x67
> [<ffffffff810526b7>] ? warn_slowpath_common+0x77/0xb0
> [<ffffffff81052735>] ? warn_slowpath_fmt+0x45/0x50
> [<ffffffff813eaa8f>] ? dev_watchdog+0x23f/0x250
> [<ffffffff813ea850>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a8fe5>] ? call_timer_fn.isra.26+0x15/0x80
> [<ffffffff81067000>] ? execute_in_process_context+0x60/0x60
> [<ffffffff813ea850>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a9218>] ? run_timer_softirq+0x1c8/0x270
> [<ffffffff81055eac>] ? __do_softirq+0x10c/0x220
> [<ffffffff8105612e>] ? irq_exit+0x8e/0xb0
> [<ffffffff81036439>] ? smp_apic_timer_interrupt+0x39/0x50
> [<ffffffff814b829e>] ? apic_timer_interrupt+0x6e/0x80
> <EOI> [<ffffffff8103f0a2>] ? native_safe_halt+0x2/0x10
> [<ffffffff8100c865>] ? default_idle+0x5/0x10
> [<ffffffff8100c9a6>] ? amd_e400_idle+0x66/0xe0
> [<ffffffff8108b429>] ? cpu_startup_entry+0x319/0x390
> [<ffffffff81034520>] ? start_secondary+0x170/0x180
> ---[ end trace c308ca6d62e7a991 ]---

The amd_e400_idle frame is puzzling, as this is a Phenom II X6 and no E400.

Stacktrace 2 (rc2):
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 35 at net/sched/sch_generic.c:303
> dev_watchdog+0x23f/0x250()
> NETDEV WATCHDOG: em1 (r8169): transmit queue 0 timed out
> Modules linked in: af_packet rfcomm bnep raid456 async_raid6_recov
> async_memcpy async_pq uvcvideo videobuf2_vmalloc videobuf2_memops
> videobuf2_core v4l2_common videodev async_xor media xor async_tx
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_limit xt_conntrack
> nf_conntrack xt_multiport msr iptable_filter ip_tables snd_usb_audio btusb
> x_tables btbcm snd_usbmidi_lib btintel bluetooth raid6_pq rfkill joydev
> w83627ehf hwmon_vid md_mod mac_hid evdev kvm_amd amdkfd kvm amd_iommu_v2
> snd_seq_dummy microcode snd_seq_oss radeon snd_emu10k1_synth snd_emux_synth
> snd_seq_midi_emul snd_seq_virmidi snd_hda_codec_realtek ttm
> snd_hda_codec_generic edac_core psmouse serio_raw edac_mce_amd pcspkr
> snd_hda_codec_hdmi k10temp drm_kms_helper snd_emu10k1 drm snd_util_mem
> snd_hda_intel snd_ac97_codec agpgart emu10k1_gp snd_hda_controller ac97_bus
> gameport i2c_algo_bit snd_hda_codec snd_hda_core sp5100_tco snd_hwdep
> nuvoton_cir snd_pcm 8250_fintek rc_core i2c_piix4 shpchp button acpi_cpufreq
> processor thermal_sys ppdev snd_seq_midi snd_seq_midi_event snd_seq
> snd_rawmidi snd_seq_device snd_timer snd soundcore asus_atk0110 hwmon
> sch_fq_codel fuse lp binfmt_misc parport_pc parport ext4 crc16 jbd2 mbcache
> hid_generic sr_mod cdrom sd_mod hid_microsoft usbhid hid firewire_ohci ahci
> r8169 firewire_core libahci mii crc_itu_t xhci_pci ohci_pci ehci_pci
> ohci_hcd xhci_hcd ehci_hcd libata usbcore scsi_mod usb_common sunrpc
> dm_mirror dm_region_hash dm_log dm_mod
> CPU: 5 PID: 35 Comm: ksoftirqd/5 Not tainted 4.1.0-rc2-36683-g979f4b5-dirty
> #29
> Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./970 Extreme3
> R2.0, BIOS P1.60 06/05/2014
> 0000000000000000 ffffffff815f162c ffffffff814b233e ffff88023620fce8
> ffffffff810526b7 0000000000000000 ffff880230d683a0 ffff880230d68000
> 0000000000000005 0000000000000001 ffffffff81052735 ffffffff815f5700
> Call Trace:
> [<ffffffff814b233e>] ? dump_stack+0x47/0x67
> [<ffffffff810526b7>] ? warn_slowpath_common+0x77/0xb0
> [<ffffffff81052735>] ? warn_slowpath_fmt+0x45/0x50
> [<ffffffff813eaa0f>] ? dev_watchdog+0x23f/0x250
> [<ffffffff813ea7d0>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a8fe5>] ? call_timer_fn.isra.26+0x15/0x80
> [<ffffffff813ea7d0>] ? dev_graft_qdisc+0x80/0x80
> [<ffffffff810a9218>] ? run_timer_softirq+0x1c8/0x270
> [<ffffffff81055eac>] ? __do_softirq+0x10c/0x220
> [<ffffffff81055fe9>] ? run_ksoftirqd+0x29/0x50
> [<ffffffff8106ff55>] ? smpboot_thread_fn+0x135/0x250
> [<ffffffff8106fe20>] ? sort_range+0x20/0x20
> [<ffffffff8106d0be>] ? kthread+0xce/0xf0
> [<ffffffff81070303>] ? smpboot_register_percpu_thread+0x63/0xf0
> [<ffffffff8106cff0>] ? kthread_create_on_node+0x180/0x180
> [<ffffffff814b77a2>] ? ret_from_fork+0x42/0x70
> [<ffffffff8106cff0>] ? kthread_create_on_node+0x180/0x180
> ---[ end trace 4f88915aa0200ae6 ]---


I would have bisected it, but it's hard to trigger.
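
Since the timeout is hard to trigger, a small log check may help anyone
who wants to mark a kernel good or bad while bisecting. What follows is
only a minimal sketch in Python: it scans kernel log text for the
"NETDEV WATCHDOG" line shown in the traces above. The names WATCHDOG_RE
and find_tx_timeouts are hypothetical, and the pattern assumes the
message format stays exactly as it appears in this report.

```python
import re

# Matches the watchdog message as it appears in the traces above, e.g.
# "NETDEV WATCHDOG: em1 (r8169): transmit queue 0 timed out".
# Assumption: the message format is the same on the kernel being tested.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(log_text):
    """Return one (iface, driver, queue) tuple per watchdog timeout found."""
    return [
        (m.group("iface"), m.group("driver"), int(m.group("queue")))
        for m in WATCHDOG_RE.finditer(log_text)
    ]
```

Fed with the output of dmesg, an empty result would suggest the kernel
has not hit the timeout so far. It cannot show that the bug is absent,
only that it has not been triggered yet.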

One thing that sticks out in the log:
> r8169 0000:05:00.0: can't disable ASPM; OS doesn't have ASPM control

but:
> acpi PNP0A03:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
> acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM


Regards,
Daniel Exner
--
Daniel Exner
Public-Key: https://www.dragonslave.de/pub_key.asc
