Re: [PATCH 1/2] KVM: async_pf: kill the unnecessary use_mm/unuse_mm async_pf_execute()

From: Andrea Arcangeli
Date: Mon Apr 28 2014 - 10:06:23 EST


Hi,

On Wed, Apr 23, 2014 at 09:32:28PM +0200, Oleg Nesterov wrote:
> On 04/22, Christian Borntraeger wrote:
> >
> > On 22/04/14 22:15, Christian Borntraeger wrote:
> > > On 21/04/14 15:25, Oleg Nesterov wrote:
> > >> async_pf_execute() has no reasons to adopt apf->mm, gup(current, mm)
> > >> should work just fine even if current has another or NULL ->mm.
> > >>
> > >> Recently kvm_async_page_present_sync() was added inside the "use_mm"
> > >> section, but it seems that it doesn't need current->mm too.
> > >>
> > >> Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
> > >
> > > Indeed, use/unuse_mm should only be necessary for copy_to/from_user etc.
> > > This is fine for s390, but it seems that x86 kvm_arch_async_page_not_present
> > > might call apf_put_user which might call copy_to_user, so this is not ok, I guess.
> >
> > wanted to say kvm_arch_async_page_not_present, but I have to correct myself.
> > x86 does the "page is there" in the cpu loop, not in the worker. The cpu loop
> > does have a valid mm. So this patch should be also ok.
>
> Thanks ;)
>
> Btw, I forgot to mention this in the changelog, but
>
> > >> @@ -80,12 +80,10 @@ static void async_pf_execute(struct work_struct *work)
> > >>
> > >> might_sleep();
> > >>
> > >> - use_mm(mm);
> > >> down_read(&mm->mmap_sem);
> > >> get_user_pages(current, mm, addr, 1, 1, 0, NULL, NULL);
> > >> up_read(&mm->mmap_sem);
> > >> kvm_async_page_present_sync(vcpu, apf);
> > >> - unuse_mm(mm);
>
> it can actually do
>
> get_user_pages(NULL, mm, addr, 1, 1, 0, NULL, NULL);
>
> "task" is only used to increment task_struct->xxx_flt. I don't think
> async_pf_execute() actually needs this (current is PF_WQ_WORKER after
> all), but I didn't dare to do another change in the code I can hardly
> understand.

Considering that the faults would be randomly distributed among the
kworker threads, my preference would also be NULL instead of current.

ptrace and uprobes tend to be the only two places that look into
another mm with gup. ptrace knows the exact pid it is triggering the
fault in, so it can also specify the correct task and the fault is
accounted in the right task struct. uprobes uses NULL.
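
To make the accounting point concrete, here is a rough userspace model,
not kernel code. The names task_model and gup_model are hypothetical
stand-ins for task_struct and get_user_pages(); the only behaviour
modelled is the one discussed above, namely that the task argument is
used solely to bump the per-task fault counters and may be NULL.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two fault counters kept in task_struct. */
struct task_model {
	unsigned long min_flt;
	unsigned long maj_flt;
};

/*
 * Hypothetical stand-in for get_user_pages(tsk, mm, ...).
 * The fault is resolved either way; it is charged to a task only
 * when the caller passes a non-NULL tsk.
 */
static int gup_model(struct task_model *tsk, int is_major)
{
	if (tsk) {
		if (is_major)
			tsk->maj_flt++;
		else
			tsk->min_flt++;
	}
	return 1;	/* one page handled */
}
```

Passing the kworker as tsk would charge the guest's fault to whichever
worker happened to run the work item, while passing NULL leaves every
counter untouched, which is the behaviour preferred above.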