Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes

From: Robin Murphy
Date: Tue Mar 23 2021 - 10:29:36 EST

Next message: lyl2019: "Re: Re: [PATCH] thunderbolt: Fix a double put in tb_cfg_read_raw"
Previous message: David Hildenbrand: "Re: [PATCH v2 1/5] mm/migrate.c: remove unnecessary VM_BUG_ON_PAGE on putback_movable_page()"
In reply to: Will Deacon: "Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes"
Next in thread: Catalin Marinas: "Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 2021-03-23 13:32, Will Deacon wrote:

On Tue, Mar 23, 2021 at 12:08:56PM +0000, Robin Murphy wrote:

On 2021-03-23 07:34, Yang Yingliang wrote:

When copy over 128 bytes, src/dst is added after
each ldp/stp instruction, it will cost more time.
To improve this, we only add src/dst after load
or store 64 bytes.

This breaks the required behaviour for copy_*_user(), since the fault
handler expects the base address to be up-to-date at all times. Say you're
copying 128 bytes and fault on the 4th store, it should return 80 bytes not
copied; the code below would return 128 bytes not copied, even though 48
bytes have actually been written to the destination.

We've had a couple of tries at updating this code (because the whole
template is frankly a bit terrible, and a long way from the well-optimised
code it was derived from), but getting the fault-handling behaviour right
without making the handler itself ludicrously complex has proven tricky. And
then it got bumped down the priority list while the uaccess behaviour in
general was in flux - now that the dust has largely settled on that I should
probably try to find time to pick this up again...

I think the v5 from Oli was pretty close, but it didn't get any review:

https://lore.kernel.org/r/20200914151800.2270-1-oli.swede@xxxxxxx

he also included tests:

https://lore.kernel.org/r/20200916104636.19172-1-oli.swede@xxxxxxx

It would be great if you or somebody else has time to revive those!

Yeah, we still have a ticket open for it. Since the uaccess overhaul has pretty much killed off any remaining value in the template idea, I might have a quick go at spinning a series to just update memcpy() and the other library routines to their shiny new versions, then come back and work on some dedicated usercopy routines built around LDTR/STTR (and the desired fault behaviour) as a follow-up.

(I was also holding out hope for copy_in_user() to disappear if we wait long enough, but apparently not yet...)

Robin.

Next message: lyl2019: "Re: Re: [PATCH] thunderbolt: Fix a double put in tb_cfg_read_raw"
Previous message: David Hildenbrand: "Re: [PATCH v2 1/5] mm/migrate.c: remove unnecessary VM_BUG_ON_PAGE on putback_movable_page()"
In reply to: Will Deacon: "Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes"
Next in thread: Catalin Marinas: "Re: [PATCH 2/3] arm64: lib: improve copy performance when size is ge 128 bytes"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]