Re: [PATCHv3, RFC 31/34] thp: initial implementation of do_huge_linear_fault()

From: Kirill A. Shutemov
Date: Thu Apr 18 2013 - 12:36:42 EST


Dave Hansen wrote:
> On 04/18/2013 09:09 AM, Kirill A. Shutemov wrote:
> > Dave Hansen wrote:
> >> On 04/17/2013 07:38 AM, Kirill A. Shutemov wrote:
> >> Are you still sure you can't do _any_ better than a verbatim copy of 129
> >> lines?
> >
> > It seems I was too lazy. Shame on me. :(
> > Here's consolidated version. Only build tested. Does it look better?
>
> Yeah, it's definitely a step in the right direction. There are
> definitely some bugs in there like:
>
> + unsigned long haddr = address & PAGE_MASK;

It's not a bug, just a bad name for the variable.
See the first 'if (try_huge_pages)': I update it there for the huge page case.

Would addr_aligned be better?
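
To illustrate the naming point, here is a minimal userspace sketch of the
pattern: the variable starts out aligned to the base page and is re-aligned
only in the huge page case. The page sizes (4 KiB and 2 MiB) and every
sketch_* name are assumptions made for illustration; the huge-page masking
step is presumed from the description above and is not copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, for illustration only: 4 KiB base pages, 2 MiB huge pages. */
#define SKETCH_PAGE_SIZE   ((uintptr_t)1 << 12)
#define SKETCH_HPAGE_SIZE  ((uintptr_t)1 << 21)

/* Build a mask that clears the offset bits below a power-of-two size. */
static inline uintptr_t sketch_mask(uintptr_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return ~(size - 1);
}

/*
 * Hypothetical helper: start with base-page alignment and switch to
 * huge-page alignment only when try_huge_pages is set, so the name
 * addr_aligned stays accurate in both cases.
 */
static inline uintptr_t sketch_addr_aligned(uintptr_t address, int try_huge_pages)
{
	uintptr_t addr_aligned = address & sketch_mask(SKETCH_PAGE_SIZE);

	if (try_huge_pages)
		addr_aligned = address & sketch_mask(SKETCH_HPAGE_SIZE);

	return addr_aligned;
}
```

With these assumed sizes, address 0x345678 aligns to 0x345000 in the small
page case and to 0x200000 in the huge page case.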

>
> I do think some of this refactoring stuff
>
> > - unlock_page(page);
> > - vmf.flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
> > - tmp = vma->vm_ops->page_mkwrite(vma, &vmf);
> > - if (unlikely(tmp &
> > - (VM_FAULT_ERROR | VM_FAULT_NOPAGE))) {
> > - ret = tmp;
> > + unlock_page(page);
> > + vmf.flags = FAULT_FLAG_WRITE | FAULT_FLAG_MKWRITE;
> > + tmp = vma->vm_ops->page_mkwrite(vma, &vmf);
> > + if (unlikely(tmp &
> > + (VM_FAULT_ERROR | VM_FAULT_NOPAGE))) {
> > + ret = tmp;
> > + goto unwritable_page;
> > + }
>
> could probably be a separate patch and would make what's going on more
> clear, but it's passable the way it is. When it is done this way it's
> hard sometimes reading the diff to realize if you are adding code or
> just moving it around.

Will do.

>
> This stuff:
>
> > if (set_page_dirty(dirty_page))
> > - dirtied = 1;
> > + dirtied = true;
>
> needs to go in another patch for sure.

Ditto.

> One thing I *REALLY* like about doing patches this way is that things
> like this start to pop out:
>
> > - ret = vma->vm_ops->fault(vma, &vmf);
> > + if (try_huge_pages) {
> > + pgtable = pte_alloc_one(mm, haddr);
> > + if (unlikely(!pgtable)) {
> > + ret = VM_FAULT_OOM;
> > + goto uncharge_out;
> > + }
> > + ret = vma->vm_ops->huge_fault(vma, &vmf);
> > + } else
> > + ret = vma->vm_ops->fault(vma, &vmf);
>
> The ->fault is (or can be) essentially per filesystem, and we're going
> to be adding support per-filesystem. Any reason we can't just handle
> this inside the ->fault code and avoid adding huge_fault altogether?

Will check. It's on my todo list already.
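
For reference, one possible shape of the single-callback approach, as a toy
userspace model only. Everything here is hypothetical: the toy_* types, the
TOY_FAULT_FLAG_HUGE flag and the return codes do not exist in the kernel and
are not part of this patchset. The sketch only shows the caller passing the
huge page request through a flag so that one ->fault handler serves both
cases.

```c
#include <assert.h>

/* Hypothetical flag for this sketch; not a real kernel fault flag. */
#define TOY_FAULT_FLAG_HUGE 0x1u

/* Toy stand-ins for the fault descriptor and the ops table. */
struct toy_vmf {
	unsigned int flags;
	unsigned long pgoff;
};

struct toy_vm_ops {
	int (*fault)(struct toy_vmf *vmf);
};

enum { TOY_MAPPED_SMALL = 1, TOY_MAPPED_HUGE = 2 };

/* A filesystem's single handler decides between the two cases itself. */
static int toy_fs_fault(struct toy_vmf *vmf)
{
	if (vmf->flags & TOY_FAULT_FLAG_HUGE)
		return TOY_MAPPED_HUGE;
	return TOY_MAPPED_SMALL;
}

static const struct toy_vm_ops toy_fs_vm_ops = { .fault = toy_fs_fault };

/* The caller sets the flag instead of choosing between two callbacks. */
static int toy_do_fault(const struct toy_vm_ops *ops, unsigned long pgoff,
			int try_huge_pages)
{
	struct toy_vmf vmf = { .flags = 0, .pgoff = pgoff };

	assert(ops && ops->fault);
	if (try_huge_pages)
		vmf.flags |= TOY_FAULT_FLAG_HUGE;

	return ops->fault(&vmf);
}
```

The trade-off in such a design would be that every ->fault implementation
has to either honour or explicitly reject the flag, whereas a separate
callback can simply be left NULL by filesystems without huge page support.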

--
Kirill A. Shutemov