Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

From: Michal Hocko
Date: Fri Aug 24 2018 - 07:36:35 EST


On Fri 24-08-18 19:54:19, Tetsuo Handa wrote:
[...]
> > --- a/mm/hmm.c
> > +++ b/mm/hmm.c
> > @@ -177,16 +177,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> >  	up_write(&hmm->mirrors_sem);
> >  }
> >  
> > -static void hmm_invalidate_range_start(struct mmu_notifier *mn,
> > +static int hmm_invalidate_range_start(struct mmu_notifier *mn,
> >  				       struct mm_struct *mm,
> >  				       unsigned long start,
> > -				       unsigned long end)
> > +				       unsigned long end,
> > +				       bool blockable)
> >  {
> >  	struct hmm *hmm = mm->hmm;
> >  
> >  	VM_BUG_ON(!hmm);
> >  
> >  	atomic_inc(&hmm->sequence);
> > +
> > +	return 0;
> >  }
> >  
> >  static void hmm_invalidate_range_end(struct mmu_notifier *mn,
>
> This assumes that hmm_invalidate_range_end() has no memory allocation
> dependency. But hmm_invalidate_range(), called from
> hmm_invalidate_range_end(), involves the
>
> 	down_read(&hmm->mirrors_sem);
> 	list_for_each_entry(mirror, &hmm->mirrors, list)
> 		mirror->ops->sync_cpu_device_pagetables(mirror, action,
> 							start, end);
> 	up_read(&hmm->mirrors_sem);
>
> sequence. What is surprising is that there is no in-tree user that assigns
> the sync_cpu_device_pagetables field.

Yes, HMM doesn't have any in-tree users yet.
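
An out-of-tree user would hook into that path roughly like the sketch below
(the driver name and the callback body are hypothetical; only the
hmm_mirror_ops hook and hmm_mirror_register() come from include/linux/hmm.h):

	#include <linux/hmm.h>

	/* hypothetical out-of-tree mirror, for illustration only */
	static void mydrv_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
						     enum hmm_update_type update_type,
						     unsigned long start,
						     unsigned long end)
	{
		/*
		 * This runs from hmm_invalidate_range_end() with
		 * hmm->mirrors_sem held for read, so anything in here that
		 * sleeps or allocates is exactly the dependency you are
		 * worried about.
		 */
	}

	static const struct hmm_mirror_ops mydrv_mirror_ops = {
		.sync_cpu_device_pagetables = mydrv_sync_cpu_device_pagetables,
	};

Such a mirror is registered against the process mm with
hmm_mirror_register(), which is what puts it on the hmm->mirrors list walked
above.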

> $ grep -Fr sync_cpu_device_pagetables *
> Documentation/vm/hmm.rst: /* sync_cpu_device_pagetables() - synchronize page tables
> include/linux/hmm.h: * will get callbacks through sync_cpu_device_pagetables() operation (see
> include/linux/hmm.h: /* sync_cpu_device_pagetables() - synchronize page tables
> include/linux/hmm.h: void (*sync_cpu_device_pagetables)(struct hmm_mirror *mirror,
> include/linux/hmm.h: * hmm_mirror_ops.sync_cpu_device_pagetables() callback, so that CPU page
> mm/hmm.c: mirror->ops->sync_cpu_device_pagetables(mirror, action,
>
> That is, this API currently seems to be used only by out-of-tree users. Since
> we can't check that none of them have a memory allocation dependency, I think
> hmm_invalidate_range_start() should return -EAGAIN if blockable == false for now.
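
For reference, what you suggest would amount to roughly this on top of the
hunk quoted above (just a sketch, not a tested change):

	static int hmm_invalidate_range_start(struct mmu_notifier *mn,
					      struct mm_struct *mm,
					      unsigned long start,
					      unsigned long end,
					      bool blockable)
	{
		struct hmm *hmm = mm->hmm;

		VM_BUG_ON(!hmm);

		/* be pessimistic: an out-of-tree mirror might need to sleep */
		if (!blockable)
			return -EAGAIN;

		atomic_inc(&hmm->sequence);

		return 0;
	}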

The code expects that invalidate_range_end doesn't block if
invalidate_range_start hasn't blocked. That is the reason why the end
callback doesn't have a blockable parameter. If this doesn't hold, then the
whole scheme is just fragile, because those two calls are supposed to pair.
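
To put it differently, a notifier user honoring that contract would look
roughly like the sketch below (the example_mirror structure and the comments
standing in for driver work are made up; only the mmu_notifier callback
signatures are the ones this patch introduces):

	#include <linux/mmu_notifier.h>
	#include <linux/rwsem.h>

	/* hypothetical notifier user; only the mmu_notifier hooks are real */
	struct example_mirror {
		struct mmu_notifier mn;
		struct rw_semaphore sem;
	};

	static int example_invalidate_range_start(struct mmu_notifier *mn,
						  struct mm_struct *mm,
						  unsigned long start,
						  unsigned long end,
						  bool blockable)
	{
		struct example_mirror *m = container_of(mn, struct example_mirror, mn);

		if (blockable)
			down_read(&m->sem);
		else if (!down_read_trylock(&m->sem))
			return -EAGAIN;	/* oom reaper path, must not sleep */

		/* start the device-side invalidation here, without sleeping */
		return 0;
	}

	static void example_invalidate_range_end(struct mmu_notifier *mn,
						 struct mm_struct *mm,
						 unsigned long start,
						 unsigned long end)
	{
		struct example_mirror *m = container_of(mn, struct example_mirror, mn);

		/*
		 * Finish the invalidation without sleeping; this assumes the
		 * core only pairs this with a _start that returned 0, so the
		 * lock is already held here.
		 */
		up_read(&m->sem);
	}

The end callback never does anything it couldn't already have done in the
start callback, which is why it doesn't need a blockable parameter of its own.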

--
Michal Hocko
SUSE Labs