Re: [PATCH v5 2/5] allow mapping page-less memremaped areas into KVA

From: Dan Williams
Date: Thu Aug 13 2015 - 08:57:32 EST


On Wed, Aug 12, 2015 at 10:58 PM, Boaz Harrosh <boaz@xxxxxxxxxxxxx> wrote:
> On 08/13/2015 06:01 AM, Dan Williams wrote:
[..]
>> +void *kmap_atomic_pfn_t(__pfn_t pfn)
>> +{
>> +	struct page *page = __pfn_t_to_page(pfn);
>> +	resource_size_t addr;
>> +	struct kmap *kmap;
>> +
>> +	rcu_read_lock();
>> +	if (page)
>> +		return kmap_atomic(page);
>
> Right even with pages I pay rcu_read_lock(); for every access?
>
>> +	addr = __pfn_t_to_phys(pfn);
>> +	list_for_each_entry_rcu(kmap, &ranges, list)
>> +		if (addr >= kmap->res->start && addr <= kmap->res->end)
>> +			return kmap->base + addr - kmap->res->start;
>> +
>
> Good god! This loop is a real *joke*. You have just dropped memory access
> performance by 10 fold.
>
> The whole point of pages and memory_model.h was to have a one-to-one
> relationship between Kernel-virtual vs physical vs page *
>
> There is already an object that holds a relationship of physical
> to Kernel-virtual. It is called a memory-section. Why not just
> widen its definition?
>
> If you are willing to accept this loop in the current Linux 2015 Kernel,
> then I have nothing further to say.
>
> Boaz - go mourning for the death of the Linux Kernel alone in the corner ;-(
>

This is explicitly addressed in the changelog, repeated here:

> The __pfn_t to resource lookup is indeed an inefficient walk of a linked
> list, but there are two mitigating factors:
>
> 1/ The number of persistent memory ranges is bounded by the number of
> DIMMs which is on the order of 10s of DIMMs, not hundreds.
>
> 2/ The lookup yields the entire range. If it becomes inefficient to do a
> kmap_atomic_pfn_t() a PAGE_SIZE at a time, the caller can take
> advantage of the fact that the lookup can be amortized for all kmap
> operations it needs to perform in a given range.
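
To make mitigating factor 2/ concrete, here is a rough userspace sketch,
not the patch itself. struct range, range_find(), range_translate() and
touch_pages() are made-up names for illustration only, and RCU/locking is
left out entirely. It only shows the shape of the idea: walk the list once,
then translate every page in the run against the range that was found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the patch's struct kmap. */
struct range {
	uint64_t start;		/* first physical address, inclusive */
	uint64_t end;		/* last physical address, inclusive */
	char *base;		/* virtual address that maps 'start' */
	struct range *next;
};

/* Push a range onto the front of the list (no locking in this sketch). */
static void range_add(struct range **head, struct range *r)
{
	assert(r->start <= r->end);
	r->next = *head;
	*head = r;
}

/* One list walk per call, like the loop in kmap_atomic_pfn_t(). */
static struct range *range_find(struct range *head, uint64_t addr)
{
	for (struct range *r = head; r; r = r->next)
		if (addr >= r->start && addr <= r->end)
			return r;
	return NULL;
}

/* Translate within an already-found range: no list walk needed. */
static void *range_translate(const struct range *r, uint64_t addr)
{
	if (addr < r->start || addr > r->end)
		return NULL;
	return r->base + (addr - r->start);
}

/*
 * Write 'val' into the first byte of each of 'npages' pages starting at
 * 'addr', doing a single lookup for the whole run. Returns the number of
 * pages touched, which is less than 'npages' if the run leaves the range.
 */
static size_t touch_pages(struct range *head, uint64_t addr,
			  size_t npages, size_t page_size, char val)
{
	struct range *r = range_find(head, addr);	/* amortized lookup */
	size_t i;

	if (!r)
		return 0;
	for (i = 0; i < npages; i++) {
		char *p = range_translate(r, addr + i * page_size);

		if (!p)
			break;
		*p = val;
	}
	return i;
}
```

With a handful of ranges on the list, a caller that works through a run of
pages this way pays for the list walk once per run instead of once per
PAGE_SIZE.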

DAX, as is, races against pmem unbind. A synchronization cost must
be paid somewhere to make sure the memremap() mapping is still valid.