Re: [PATCH 3/3] dax: Handle write faults more efficiently

From: Andy Lutomirski
Date: Wed Jan 27 2016 - 00:23:44 EST


On Tue, Jan 26, 2016 at 8:17 PM, Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote:
> On Mon, Jan 25, 2016 at 09:38:19AM -0800, Andy Lutomirski wrote:
>> On Mon, Jan 25, 2016 at 9:25 AM, Matthew Wilcox
>> <matthew.r.wilcox@xxxxxxxxx> wrote:
>> > From: Matthew Wilcox <willy@xxxxxxxxxxxxxxx>
>> >
>> > When we handle a write-fault on a DAX mapping, we currently insert a
>> > read-only mapping and then take the page fault again to convert it to
>> > a writable mapping. This is necessary for the case where we cover a
>> > hole with a read-only zero page, but when we have a data block already
>> > allocated, it is inefficient.
>> >
>> > Use the recently added vmf_insert_pfn_prot() to insert a writable mapping,
>> > even though the default VM flags say to use a read-only mapping.
>>
>> Conceptually, I like this. Do you need to make sure to do all the
>> do_wp_page work, though? (E.g. we currently update mtime in there.
>> Some day I'll fix that, but it'll be replaced with a set_bit to force
>> a deferred mtime update.)
>
> We update mtime in the ->fault handler of filesystems which support DAX
> like this:
>
> 	if (vmf->flags & FAULT_FLAG_WRITE) {
> 		sb_start_pagefault(inode->i_sb);
> 		file_update_time(vma->vm_file);
> 	}
>
> so I think we're covered.

Sounds good.

On second reading, though, what ensures that the vma is
VM_WRITE|VM_SHARED? If nothing else, some nice comments might help.

A WARN_ON_ONCE that the pgprot you're starting with is RO would be
nice if there's a generic way to do that. Actually, having a generic
pgprot_writable could make this less ugly.
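
A rough userspace toy of that check, not kernel code. Everything
prefixed toy_ or TOY_ is a made-up name for illustration; real pgprot
bits are arch-specific, which is why a generic helper would be needed
in the first place:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the arch-specific pgprot_t; NOT the kernel type. */
typedef struct { unsigned long bits; } toy_pgprot_t;

#define TOY_PROT_PRESENT 0x1UL
#define TOY_PROT_WRITE   0x2UL	/* hypothetical "writable" bit */

/* Hypothetical generic helper: does this protection allow writes? */
static bool toy_pgprot_writable(toy_pgprot_t prot)
{
	return (prot.bits & TOY_PROT_WRITE) != 0;
}

/*
 * Upgrade a protection to writable.  If it was already writable, the
 * caller's assumption (starting from RO) is wrong, so complain once,
 * loosely imitating what WARN_ON_ONCE would do.
 */
static toy_pgprot_t toy_make_writable(toy_pgprot_t prot, int *warned)
{
	assert(warned != NULL);
	if (toy_pgprot_writable(prot) && !*warned) {
		*warned = 1;
		fprintf(stderr, "warning: pgprot was already writable\n");
	}
	prot.bits |= TOY_PROT_WRITE;
	return prot;
}
```

The point is only that the "is it writable?" question lives in one
helper, so the caller doesn't have to open-code arch bits.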

Also, this optimization could be generalized, albeit in a slightly
slower form, by having handle_pte_fault check whether the pte inserted
for a write fault is read-only and, if so, continue down the function
to the wp_page logic. After all, returning to the arch entry code and
retrying the fault the old-fashioned way is both very slow and has an
outcome that's known in advance.
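
A userspace toy model of the difference, not kernel code. All toy_
names are invented for illustration, and the real handle_pte_fault
does far more than this. It only counts how many trips through the
fault path a first access needs:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy pte: just the two bits this discussion cares about. */
struct toy_pte {
	bool present;
	bool writable;
};

/* Stand-in for a ->fault handler that always installs a read-only pte. */
static void toy_fault_insert_ro(struct toy_pte *pte)
{
	pte->present = true;
	pte->writable = false;
}

/* Stand-in for the write-protect (wp_page) logic: make the pte writable. */
static void toy_wp_page(struct toy_pte *pte)
{
	assert(pte->present);
	pte->writable = true;
}

/*
 * One pass through the toy fault path.  Returns true if the faulting
 * access would now succeed, false if it would immediately fault again.
 */
static bool toy_handle_fault(struct toy_pte *pte, bool write, bool fall_through)
{
	if (!pte->present) {
		toy_fault_insert_ro(pte);
		if (!write)
			return true;
		if (!fall_through)
			return false;	/* write to an RO pte: refault */
		/* generalized variant: continue on to the wp logic */
	}
	if (write && !pte->writable)
		toy_wp_page(pte);
	return true;
}

/* Count trips through the fault path until the access succeeds. */
static int toy_fault_trips(bool write, bool fall_through)
{
	struct toy_pte pte = { false, false };
	int trips = 0;

	do {
		trips++;
	} while (!toy_handle_fault(&pte, write, fall_through));
	return trips;
}
```

In this model a write fault takes two trips when the fault path
returns after inserting the read-only pte, and one trip when it falls
through to the wp logic; read faults take one trip either way.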

--Andy