Re: [PATCH 0/3] resource: find_next_iomem_res() improvements

From: Nadav Amit
Date: Tue Jun 18 2019 - 13:47:14 EST

> On Jun 17, 2019, at 11:44 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> On Wed, Jun 12, 2019 at 9:59 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>> Running some microbenchmarks on dax keeps showing find_next_iomem_res()
>> as a place in which significant amount of time is spent. It appears that
>> in order to determine the cacheability that is required for the PTE,
>> lookup_memtype() is called, and this one traverses the resources list in
>> an inefficient manner. This patch-set tries to improve this situation.
> Let's just do this lookup once per device, cache that, and replay it
> to modified vmf_insert_* routines that trust the caller to already
> know the pgprot_values.

IIUC, one device can have multiple regions with different characteristics,
which require different cacheability. Apparently, that is the reason there
is a tree of resources. Please be more specific about where you want to
cache it.

Perhaps you want to cache the cacheability mode in vma->vm_page_prot (which I
see being done in quite a few cases), but I don't know the code well enough
to be certain that every vma should have a single protection and that it
should not change afterwards.
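
To make the concern concrete, here is a rough userspace sketch, not kernel
code. Every name in it (toy_region, toy_mapping, toy_mapping_cache, and so
on) is hypothetical and only stands in for the resource walk and for a
per-vma cached value. It assumes each region has exactly one memory type,
and it caches a type for a mapping only when the whole mapping falls inside
a single region; otherwise it falls back to the slow lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache modes; stand-ins, not the kernel's definitions. */
enum toy_memtype { TOY_WB, TOY_WC, TOY_UC };

/* One region of a device: [start, end) with a single memory type. */
struct toy_region {
	uint64_t start;
	uint64_t end;
	enum toy_memtype type;
};

/* A mapping [start, end) that may carry one cached memory type. */
struct toy_mapping {
	uint64_t start;
	uint64_t end;
	bool cached_valid;
	enum toy_memtype cached_type;
};

/* Slow path: linear scan over the regions (stand-in for the resource walk). */
static const struct toy_region *
toy_find_region(const struct toy_region *regs, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= regs[i].start && addr < regs[i].end)
			return &regs[i];
	return NULL;
}

/*
 * Done once per mapping: cache the type only if the whole mapping lies
 * inside one region. Returns false if no single type can be cached.
 */
static bool toy_mapping_cache(struct toy_mapping *m,
			      const struct toy_region *regs, size_t n)
{
	const struct toy_region *first;

	assert(m->start < m->end);
	m->cached_valid = false;

	first = toy_find_region(regs, n, m->start);
	if (!first || m->end > first->end)
		return false;

	m->cached_type = first->type;
	m->cached_valid = true;
	return true;
}

/* Per-fault lookup: use the cached type if valid, else do the slow scan. */
static bool toy_lookup(const struct toy_mapping *m,
		       const struct toy_region *regs, size_t n,
		       uint64_t addr, enum toy_memtype *out)
{
	const struct toy_region *reg;

	if (m->cached_valid) {
		*out = m->cached_type;
		return true;
	}

	reg = toy_find_region(regs, n, addr);
	if (!reg)
		return false;

	*out = reg->type;
	return true;
}
```

In this toy model a mapping that straddles two regions never gets a cached
value, which is exactly the case I am unsure about for a real vma.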