Re: Realtek NIC uses over 1 Watt with no traffic

From: Heiner Kallweit
Date: Tue Nov 20 2018 - 17:29:55 EST


On 20.11.2018 23:25, Paul Menzel wrote:
> Dear Heiner,
>
>
> Am 20.11.18 um 22:06 schrieb Heiner Kallweit:
>> On 20.11.2018 21:31, Paul Menzel wrote:
>
> [...]
>
>>> Am 20.11.18 um 21:14 schrieb Heiner Kallweit:
>>>> On 20.11.2018 15:45, Andrew Lunn wrote:
>>>>> On Tue, Nov 20, 2018 at 09:40:25AM +0100, Paul Menzel wrote:
>>>
>>>>>> Using Ubuntu 18.10, Linux 4.18.0-11-generic, PowerTOP 2.9 shows, the NIC
>>>>>> uses 1.77 Watts. A network cable is plugged in, but there is no real traffic
>>>>>> according to `iftop`. Only an email program is running.
>>>>>>
>>>>>>      $ lspci -nn -s 3:00.1
>>>>>>      03:00.1 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
>>>>>> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev
>>>>>> 12)
>>>>>>
>>>>>> Is that a measurement error, or does the NIC really need that much power?
>>>
>>>>> This sounds like Energy Efficient Ethernet, EEE, is not enabled.
>>>>>
>>>>> What does ethtool --show-eee ethX say?
>>>
>>>      $ sudo ethtool --show-eee enp3s0f1
>>>      Cannot get EEE settings: Operation not supported
>>>
>>>> The r8169 driver doesn't support the get_eee ethtool_ops callback.
>>>> For certain chip versions EEE gets enabled in the PHY init, for others
>>>> not and some don't seem to support EEE at all.
>>>>
>>>> Apart from EEE one important factor affecting power consumption is ASPM.
>>>> This was recently enabled for certain chip versions.
>>>>
>>>> Information that would help:
>>>>
>>>> whether Wake-on-LAN is enabled ("Wake-on:" line from ethtool output)
>>>
>>> ```
>>> $ sudo ethtool enp3s0f1
>>> Settings for enp3s0f1:
>>>      Supported ports: [ TP AUI BNC MII FIBRE ]
>>>      Supported link modes:   10baseT/Half 10baseT/Full
>>>                              100baseT/Half 100baseT/Full
>>>                              1000baseT/Full
>>>      Supported pause frame use: Symmetric Receive-only
>>>      Supports auto-negotiation: Yes
>>>      Supported FEC modes: Not reported
>>>      Advertised link modes:  10baseT/Half 10baseT/Full
>>>                              100baseT/Half 100baseT/Full
>>>                              1000baseT/Full
>>>      Advertised pause frame use: Symmetric Receive-only
>>>      Advertised auto-negotiation: Yes
>>>      Advertised FEC modes: Not reported
>>>      Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>>                                           100baseT/Half 100baseT/Full
>>>                                           1000baseT/Full
>>>      Link partner advertised pause frame use: Symmetric
>>>      Link partner advertised auto-negotiation: Yes
>>>      Link partner advertised FEC modes: Not reported
>>>      Speed: 1000Mb/s
>>>      Duplex: Full
>>>      Port: MII
>>>      PHYAD: 0
>>>      Transceiver: internal
>>>      Auto-negotiation: on
>>>      Supports Wake-on: pumbg
>>>      Wake-on: g
>>>      Current message level: 0x00000033 (51)
>>>                             drv probe ifdown ifup
>>>      Link detected: yes
>>> ```
>>>
>>> So, it's enabled (g: Wake on MagicPacket(tm)).
>>>
>>> Running `sudo ethtool -s enp3s0f1 wol d` doesn't change anything though.
>>>
>>>> lspci -vv output for the Realtek NIC
>>>
>>> Here is the output (quoted, so that Thunderbird does not wrap the line).
>>>
>>>> $ sudo lspci -vv -s 3:00.1
>>>> 03:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
>>>>      Subsystem: CLEVO/KAPOK Computer RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>      Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>>      Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>      Latency: 0, Cache Line Size: 64 bytes
>>>>      Interrupt: pin A routed to IRQ 19
>>>>      Region 0: I/O ports at e000 [size=256]
>>>>      Region 2: Memory at df114000 (64-bit, non-prefetchable) [size=4K]
>>>>      Region 4: Memory at df110000 (64-bit, non-prefetchable) [size=16K]
>>>>      Capabilities: [40] Power Management version 3
>>>>          Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>>>          Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>>>      Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>>>>          Address: 0000000000000000  Data: 0000
>>>>      Capabilities: [70] Express (v2) Endpoint, MSI 01
>>>>          DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>>>>              ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
>>>>          DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>>>>              RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>>>              MaxPayload 128 bytes, MaxReadReq 4096 bytes
>>>>          DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>>>>          LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
>>>>              ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>>>          LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>
>> L0s is missing here, no idea why.
>
> Indeed. I'll forward that to TUXEDO.
>
>>>>              ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>>>>          LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>>>          DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
>>>>          DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>>>>          LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>>>>               EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>>>      Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>>>>          Vector table: BAR=4 offset=00000000
>>>>          PBA: BAR=4 offset=00000800
>>>>      Capabilities: [d0] Vital Product Data
>>>> pcilib: sysfs_read_vpd: read failed: Input/output error
>>>>          Not readable
>>>>      Capabilities: [100 v2] Advanced Error Reporting
>>>>          UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>          UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>          UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>>>          CESta: RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+
>>>>          CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>>>>          AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>>>>      Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
>>>>      Capabilities: [170 v1] Latency Tolerance Reporting
>>>>          Max snoop latency: 3145728ns
>>>>          Max no snoop latency: 3145728ns
>>>>      Capabilities: [178 v1] L1 PM Substates
>>>>          L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>>>>                PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
>>>>          L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>>>>                 T_CommonMode=0us LTR1.2_Threshold=0ns
>>>>          L1SubCtl2: T_PwrOn=10us
>>>>      Kernel driver in use: r8169
>>>>      Kernel modules: r8169
>>>
>>> Some Active State Power Management levels seem to be enabled.
>>>
>>>> Info from powertop about package C states. With ASPM my system reaches
>>>> 50% PC7 + 50% PC10.
>>>
>>> That seems to be the case on my TUXEDO Book BU1406 too.
>>>
>>>>           Package   |             Core    |            CPU 0       CPU 2
>>>>                     |                     | C0 active   1,7%        1,1%
>>>>                     |                     | POLL        0,0%    0,0 ms  0,0%    0,0 ms
>>>>                     |                     | C1E         0,2%    0,8 ms  0,1%    0,2 ms
>>>> C2 (pc2)    5,2%    |                     |
>>>> C3 (pc3)   82,1%    | C3 (cc3)    0,0%    | C3          0,0%    0,2 ms  0,1%    0,2 ms
>>
>> Relevant are the package states and your system reaches pc3 only. The "Tunables" section
>> in powertop may provide hints how to save more power.
>
> Thank you for the hint. As it's unrelated, I'll just paste the tunables below, but will try to forward it to the correct people.
>
>     Bad           Enable Audio codec power management
>     Bad           VM writeback timeout
>
>>>> C6 (pc6)    0,0%    | C6 (cc6)    1,3%    | C6          0,8%    0,5 ms  1,4%    0,6 ms
>>>> C7 (pc7)    0,0%    | C7 (cc7)   90,8%    | C7s         0,0%    1,6 ms  0,0%    0,0 ms
>>>> C8 (pc8)    0,0%    |                     | C8          6,0%    1,8 ms 10,1%    2,0 ms
>>>> C9 (pc9)    0,0%    |                     | C9          0,2%    2,8 ms  0,2%    2,9 ms
>>>> C10 (pc10)  0,0%    |                     | C10        88,7%   12,7 ms 84,4%   14,9 ms
>>>>
>>>>                     |             Core    |            CPU 1       CPU 3
>>>>                     |                     | C0 active   1,0%        0,8%
>>>>                     |                     | POLL        0,0%    0,0 ms  0,0%    0,0 ms
>>>>                     |                     | C1E         0,1%    0,3 ms  0,1%    0,3 ms
>>>>                     |                     |
>>>>                     | C3 (cc3)    0,0%    | C3          0,0%    0,2 ms  0,0%    0,2 ms
>>>>                     | C6 (cc6)    1,1%    | C6          0,9%    0,6 ms  0,8%    0,5 ms
>>>>                     | C7 (cc7)   92,2%    | C7s         0,0%    1,7 ms  0,0%    0,0 ms
>>>>                     |                     | C8          6,2%    1,7 ms  5,4%    1,7 ms
>>>>                     |                     | C9          0,3%    1,7 ms  0,1%    1,9 ms
>>>>                     |                     | C10        88,8%   12,1 ms 90,7%   14,8 ms
>>>>
>>>>                     |             GPU     |
>>>>                     |                     |
>>>>                     | Powered On  2,2%    |
>>>>                     | RC6        97,8%    |
>>>>                     | RC6p        0,0%    |
>>>>                     | RC6pp       0,0%    |
>>>
>>>> dmesg output filtered for "r8169". Primarily relevant is the line with
>>>> the chip name and XID.
>>>
>>> Please find them below.
>>>
>>>> $ sudo dmesg | grep r8169
>>>> [    5.318442] calling rtl8169_pci_driver_init+0x0/0x1000 [r8169] @ 418
>>>> [    5.318470] r8169 0000:03:00.1: enabling device (0000 -> 0003)
>>>> [    5.340324] libphy: r8169: probed
>>>> [    5.340630] r8169 0000:03:00.1 eth0: RTL8411, 80:fa:5b:3b:dd:f0, XID 5c800800, IRQ 136
>>
>> Good to know. For this chip version rtl8168g_2_hw_phy_config() is used to configure the PHY,
>> but this function just loads the firmware. So we don't know whether EEE is enabled.
>>
>> What you could do to test further is limiting the speed to 100MBit or 10MBit via ethtool.
>> If this reduces power consumption significantly it's a hint that indeed the PHY seems
>> to be the one to be blamed.
>
> With `sudo ethtool -s enp3s0f1 speed 10 duplex full` the power usage drops to 800 mW and even to 0, so it's much less than with 1 Gbit/s.
>
OK, so Andrew was right and the issue seems to be that EEE is disabled.
I'll put this on my agenda. Most likely, in step 1 you'll have to use
ethtool to switch on EEE; in step 2, EEE will be enabled by default
for this chip version.
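
A rough sketch of what step 1 might look like from the shell, assuming the
r8169 driver has gained the get_eee/set_eee ethtool callbacks (it has not in
the kernel discussed in this thread, where --show-eee still returns
"Operation not supported"). The helper name eee_enable and the RUN dry-run
variable are made up for illustration; only the ethtool options themselves
are real. By default the script just prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: assumes driver support for the EEE ethtool callbacks,
# which the r8169 version discussed here does not yet provide.
IFACE="${IFACE:-enp3s0f1}"   # interface name taken from the report above
RUN="${RUN:-echo}"           # dry run by default; set RUN=sudo to execute

# eee_enable is a hypothetical helper name, not part of ethtool.
eee_enable() {
    $RUN ethtool --show-eee "$1"          # query the current EEE state
    $RUN ethtool --set-eee "$1" eee on    # ask the driver to enable EEE
}

eee_enable "$IFACE"
```

Running it with RUN=sudo would execute the two commands instead of printing
them; whether the link partner also negotiates EEE is a separate question.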

>>>> [    5.340632] r8169 0000:03:00.1 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>>> [    5.340673] initcall rtl8169_pci_driver_init+0x0/0x1000 [r8169] returned 0 after 9217 usecs
>>>> [    5.799967] r8169 0000:03:00.1 enp3s0f1: renamed from eth0
>>>> [   10.036968] Generic PHY r8169-301:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-301:00, irq=IGNORE)
>>>> [  676.940934] calling rtl8169_pci_driver_init+0x0/0x1000 [r8169] @ 22235
>>>> [  676.952411] libphy: r8169: probed
>>>> [  676.952701] r8169 0000:03:00.1 eth0: RTL8411, 80:fa:5b:3b:dd:f0, XID 5c800800, IRQ 139
>>>> [  676.952702] r8169 0000:03:00.1 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>>> [  676.952736] initcall rtl8169_pci_driver_init+0x0/0x1000 [r8169] returned 0 after 11518 usecs
>>>> [  676.954420] r8169 0000:03:00.1 enp3s0f1: renamed from eth0
>>>> [  676.975161] Generic PHY r8169-301:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-301:00, irq=IGNORE)
>>>> [  680.518923] r8169 0000:03:00.1 enp3s0f1: Link is Up - 1Gbps/Full - flow control rx/tx
>>>> [ 1751.285899] r8169 0000:03:00.1: invalid short VPD tag 00 at offset 1
>
>
> Kind regards,
>
> Paul
>