Re: [PATCH net] r8152: Fix broken RX checksums.

From: Mark Lord
Date: Sun Oct 30 2016 - 22:09:51 EST


On 16-10-30 08:57 PM, David Miller wrote:
> From: Mark Lord <mlord@xxxxxxxxx>
> Date: Sun, 30 Oct 2016 19:28:27 -0400
>
>> The r8152 driver has been broken since (approx) 3.16.xx
>> when support was added for hardware RX checksums
>> on newer chip versions. Symptoms include random
>> segfaults and silent data corruption over NFS.
>>
>> The hardware checksum logig does not work on the VER_02
>> dongles I have here when used with a slow embedded system CPU.
>> Google reveals others reporting similar issues on Raspberry Pi.
>>
>> So, disable hardware RX checksum support for VER_02, and fix
>> an obvious coding error for IPV6 checksums in the same function.
>>
>> Because this bug results in silent data corruption,
>> it is a good candidate for back-porting to -stable >= 3.16.xx.
>>
>> Signed-off-by: Mark Lord <mlord@xxxxxxxxx>
>
> Applied and queued up for -stable, thanks.

Thanks. Now that this is taken care of, I do wonder if perhaps
RX checksums ought to be enabled at all for ANY versions of this chip?

My theory is that the checksums probably work okay most of the time,
except when the hardware RX buffer overflows.

In my case, and in the case of the Raspberry Pi, the receiving CPU
is quite a bit slower than mainstream x86, so it can quite easily
fall behind in emptying the RX buffer on the chip.
The only indication this has happened may be an incorrect RX checksum.

This is only a theory, but I otherwise have trouble explaining
why we are seeing invalid RX checksums -- direct cable connections
to a switch, shared only with the NFS server. No reason for it
to have bad RX checksums in the first place.

Should we just blanket disable RX checksums for all versions here
unless proven otherwise/safe?

Anyone out there know better?

Cheers
Mark