Re: [Bug fix]mips 64bits checksum error -- csum_tcpudp_nofold

From: Ralf Baechle
Date: Thu Jan 26 2017 - 11:13:40 EST


On Thu, Jan 26, 2017 at 07:57:49AM -0800, Alexander Duyck wrote:
> Date: Thu, 26 Jan 2017 07:57:49 -0800
> From: Alexander Duyck <alexander.duyck@xxxxxxxxx>
> To: Mark Zhang <bomb.zhang@xxxxxxxxx>
> Cc: Ralf Baechle <ralf@xxxxxxxxxxxxxx>, David Miller <davem@xxxxxxxxxxxxx>,
> Alexander Duyck <aduyck@xxxxxxxxxxxx>, linux-mips@xxxxxxxxxxxxxx,
> "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>
> Subject: Re: [Bug fix]mips 64bits checksum error -- csum_tcpudp_nofold
> Content-Type: text/plain; charset=UTF-8
>
> On Wed, Jan 25, 2017 at 6:33 PM, Mark Zhang <bomb.zhang@xxxxxxxxx> wrote:
> > Hi Alex,
> >
> > Thanks for your reply.
> > I tested your correction. The result is correct.
> > The C language will cause this function(csum_tcpudp_nofold) become
> > 176 MIPS instructions. The assemble code is 150 MIPS instruction.
> > If the MIPS CPU is running as 1GHz, C language cause more 60 nano
> > seconds during send/receive each tcp/udp packet. I'm not sure whether
> > it will cause any negative result if the frequency of CPU was lower.
> > MIPS arch is usually used in networking equipments.
> > I think Ralf's correction is better.
> >
> > - Mark
> >
> > Signed-off-by: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
> >
> > arch/mips/include/asm/checksum.h | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
> > index 5c585c5..08b6ba3 100644
> > --- a/arch/mips/include/asm/checksum.h
> > +++ b/arch/mips/include/asm/checksum.h
> > @@ -186,7 +186,9 @@ static inline __wsum csum_tcpudp_nofold(__be32 saddr,
> > " daddu %0, %4 \n"
> > " dsll32 $1, %0, 0 \n"
> > " daddu %0, $1 \n"
> > + " sltu $1, %0, $1 \n"
> > " dsra32 %0, %0, 0 \n"
> > + " daddu %0, $1 \n"
> > #endif
> > " .set pop"
> > : "=r" (sum)
> >
>
> This looks good to me.
>
> Acked-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxx>

I've actually checked in a slightly different version that this which
uses an ADDU rather than a DADDU for the second instruction added. This
is because the DSRA32 ensures the 32 bit result in %0 is properly
signed extended to 64 bit as required by the MIPS architecture and the
ADDU then simply operates on that 32 bit %0.

Ralf