Re: [PATCH] r8169: inform about CLKRUN protocol issue when behind a CardBus bridge
From: Maciej S. Szmigiero
Date: Sun Sep 09 2018 - 23:27:13 EST

On 09.09.2018 17:09, David Miller wrote:
> From: "Maciej S. Szmigiero" <mail@xxxxxxxxxxxxxxxxxxxxx>
> Date: Thu, 6 Sep 2018 18:10:53 +0200
>
>> It turns out that at least some r8169 CardBus cards don't operate correctly
>> when CLKRUN protocol is enabled - the symptoms are recurring timeouts
>> during PHY reads / writes and a very high packet drop rate.
>> This is true of at least RTL8169sc/8110sc (XID 18000000) chip in
>> Sunrich C-160 CardBus NIC.
>>
>> Such behavior was observed on two separate laptops, the first one has
>> TI PCIxx12 CardBus bridge, while the second one has Ricoh RL5c476II.
>>
>> Setting CLKRUN_En bit in CONFIG 3 register via an EEPROM write didn't
>> improve things in either case (this is probably why it wasn't set by the
>> card manufacturer).
>> The only way to fix the issue was to disable the CLKRUN protocol either
>> in the CardBus bridge (only possible in the TI one) or in the southbridge.
>>
>> Since the problem takes some time to debug let's warn people that have
>> the suspect configuration (Conventional PCI r8169 NIC behind a CardBus
>> bridge) so they know what they can do if they encounter it.
>>
>> Signed-off-by: Maciej S. Szmigiero <mail@xxxxxxxxxxxxxxxxxxxxx>
>
> I don't know about this.
>
> Barking at the user in the kernel log about an obscure knob (which btw
> doesn't exist for all cardbus bridges without other patches you are
> posting elsewhere) is rarely effective.
>
> We should just disable clkrun automatically when we know it causes problems.

Unfortunately, as you wrote above, this workaround is currently only
available on TI CardBus bridges (and, I hope, will soon be available for
two Ricoh ones, too), while for other CardBus bridges it is either not
implemented or not available at all.
So we can't reliably just enable this workaround automatically when needed.

BTW, it seems that my RTL8169 card isn't the only model affected.
In fact, the original CLKRUN protocol disabling workaround on TI bridges
was implemented in 2005 because somebody's RTL8139 also had this
problem: https://lkml.org/lkml/2005/2/5/129

The main reason I wanted to add this warning is to save people time
debugging this issue, as it is far from obvious.
But if this solution is unacceptable then I hope at least this
description will show up in search results when someone searches for
related keywords.
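
For anyone who finds this through a search: below is a rough userspace
sketch of the condition the warning is keyed on, that is, whether the
bridge above the NIC is a CardBus bridge. This is not the code from the
patch. The helper names are made up for illustration, and the input is
assumed to be a class string in the format of the sysfs "class"
attribute of a PCI device (for example "0x060700"). The value 0x0607 is
the standard PCI class code for a CardBus bridge.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Standard PCI base class 0x06 (bridge), subclass 0x07 (CardBus). */
#define CLASS_BRIDGE_CARDBUS 0x0607u

/*
 * Hypothetical helper: takes the 24-bit class value (base class,
 * subclass, programming interface) and reports whether it describes
 * a CardBus bridge. The programming interface byte is ignored.
 */
static bool class_is_cardbus_bridge(uint32_t class_code)
{
	return ((class_code >> 8) & 0xffffu) == CLASS_BRIDGE_CARDBUS;
}

/*
 * Hypothetical helper: parses a class string such as "0x060700",
 * optionally followed by a newline. Returns false on malformed or
 * out-of-range input.
 */
static bool class_string_is_cardbus_bridge(const char *s)
{
	char *end;
	unsigned long v;

	if (s == NULL || *s == '\0')
		return false;
	v = strtoul(s, &end, 16);
	if (end == s || (*end != '\0' && *end != '\n'))
		return false;
	if (v > 0xfffffful)
		return false;
	return class_is_cardbus_bridge((uint32_t)v);
}
```

If the class string of the NIC's parent bridge passes this check and the
symptoms described above show up, disabling the CLKRUN protocol in the
bridge or the southbridge is the thing to try.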

Thanks,
Maciej