Re: [PATCH] x86: add back the alignment of the destination to 8 bytes in copy_user_generic()
From: Herton Krzesinski
Date: Mon Mar 17 2025 - 09:18:58 EST
On Sun, Mar 16, 2025 at 8:10 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
>
> * Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> > > It does look good in my testing here. I built the same kernel I
> > > was using for testing the original patch (based on 6.14-rc6); this
> > > is one of the results I got in one of the runs on the same
> > > machine:
> > >
> > > CPU                RATE           SYS           TIME          sender-receiver
> > > Server bind 19:    20.8Gbits/sec  14.832313000  20.863476111  75.4%-89.2%
> > > Server bind 21:    18.0Gbits/sec  18.705221000  23.996913032  80.8%-89.7%
> > > Server bind 23:    20.1Gbits/sec  15.331761000  21.536657212  75.0%-89.7%
> > > Server bind none:  24.1Gbits/sec  14.164226000  18.043132731  82.3%-87.1%
> > >
> > > There is still some variation between runs, which is expected: I
> > > saw the same when testing my patch and in the unaligned case. But
> > > the result is consistently better/higher than in the no-align
> > > case. It really looks like it's sufficient to align only for
> > > copies of 64 bytes or more.
> >
> > Mind sending a v2 patch with a changelog and these benchmark numbers
> > added in, and perhaps a Co-developed-by tag with Linus or so?
>
> BTW., if you have a test system available, it would be nice to test a
> server CPU in the Intel spectrum as well. (For completeness mostly, I'd
> not expect there to be as much alignment sensitivity.)
>
> The CPU you tested, AMD Epyc 7742 was launched ~6 years ago so it's
> still within the window of microarchitectures we care about. An Intel
> test would be nice from a similar timeframe as well. Older is probably
> better in this case, but not too old. :-)
>
> ( Note that the Intel test is not required to apply the fix IMO - we
> did change alignment patterns ~2 years ago in a5624566431d which
> regressed. )
Yes, I'll work on sending a v2, and I'll try to test on an Intel system and compare.
>
> Thanks,
>
> Ingo
>