Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

From: Michal Hocko
Date: Fri Aug 24 2018 - 09:32:16 EST

On Fri 24-08-18 22:02:23, Tetsuo Handa wrote:
> On 2018/08/24 20:36, Michal Hocko wrote:
> >> That is, this API seems to be currently used by only out-of-tree users. Since
> >> we can't check that nobody has memory allocation dependency, I think that
> >> hmm_invalidate_range_start() should return -EAGAIN if blockable == false for now.
> >
> > The code expects that the invalidate_range_end doesn't block if
> > invalidate_range_start hasn't blocked. That is the reason why the end
> > callback doesn't have blockable parameter. If this doesn't hold then the
> > whole scheme is just fragile because those two calls should pair.
> >
> That is, the more worrisome part in that patch is that I don't know whether
> using trylock if blockable == false at entry is really sufficient. Since those
> two calls should pair, I think that we need to determine whether we need to
> return -EAGAIN at start call by evaluating both calls.

Yes, and I believe I have done that audit, modulo my misunderstanding of
the code.

> Like mn_invl_range_start() involves schedule_delayed_work() which could be
> blocked on memory allocation under OOM situation,

It doesn't, because that code path is not invoked for the !blockable case.

> I worry that (currently
> out-of-tree) users of this API are involving work / recursion.

I do not care in the slightest about out-of-tree modules. They will have to
adapt to the new API. I have no problem extending the documentation to be
explicit about this expectation.

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 133ba78820ee..698e371aafe3 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -153,7 +153,9 @@ struct mmu_notifier_ops {
 	 * If blockable argument is set to false then the callback cannot
 	 * sleep and has to return with -EAGAIN. 0 should be returned
-	 * otherwise.
+	 * otherwise. Please note that if invalidate_range_start approves
+	 * a non-blocking behavior then the same applies to
+	 * invalidate_range_end.
 	int (*invalidate_range_start)(struct mmu_notifier *mn,

> And hmm_release() says that
> /*
> * Drop mirrors_sem so callback can wait on any pending
> * work that might itself trigger mmu_notifier callback
> * and thus would deadlock with us.
> */
> and keeps "all operations protected by hmm->mirrors_sem held for write are
> atomic". This suggests that "some operations protected by hmm->mirrors_sem held
> for read will sleep (and in the worst case involves memory allocation
> dependency)".

Yes, and so what? The clear expectation is that neither of the range
notifiers sleeps in !blockable mode. I really fail to see what you
are trying to say.
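
To make that expectation concrete, here is a rough user-space sketch of the
contract. It is not kernel code: struct demo_notifier, demo_range_start() and
demo_range_end() are hypothetical names, and a pthread mutex stands in for
whatever lock a real notifier would take. Holding the lock from start to end
is an assumption made only to keep the example short.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a notifier's private state; not a kernel type. */
struct demo_notifier {
	pthread_mutex_t lock;
};

/*
 * Start callback: may sleep on the lock only when blockable is true.
 * In !blockable mode it only tries the lock and backs off with -EAGAIN.
 */
static int demo_range_start(struct demo_notifier *dn, bool blockable)
{
	if (blockable) {
		pthread_mutex_lock(&dn->lock);
	} else if (pthread_mutex_trylock(&dn->lock) != 0) {
		return -EAGAIN;	/* would have had to sleep */
	}
	/* ... invalidate the range here, without sleeping if !blockable ... */
	return 0;
}

/*
 * End callback: pairs with a successful start. It has no blockable
 * parameter, so it must never do anything that could sleep.
 */
static void demo_range_end(struct demo_notifier *dn)
{
	pthread_mutex_unlock(&dn->lock);
}
```

A caller that gets -EAGAIN from the start callback simply does not call the
end callback for that notifier; only a successful start is paired with an end.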

Michal Hocko