Re: [Intel-gfx] [PATCH 2/5] kernel.h: Add non_block_start/end()
From: Daniel Vetter
Date: Fri Aug 16 2019 - 02:21:18 EST
On Fri, Aug 16, 2019 at 3:00 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> On Thu, Aug 15, 2019 at 10:49:31PM +0200, Daniel Vetter wrote:
> > On Thu, Aug 15, 2019 at 10:27 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > > On Thu, Aug 15, 2019 at 10:16:43PM +0200, Daniel Vetter wrote:
> > > > So if someone can explain to me how that works with lockdep I can of
> > > > course implement it. But afaics that doesn't exist (I tried to explain
> > > > that somewhere else already), and I'm not really looking forward to
> > > > hacking also on lockdep for this little series.
> > >
> > > Hmm, kind of looks like it is done by calling preempt_disable()
> >
> > Yup. That was v1, then came the suggestion that disabling preemption
> > is maybe not the best thing (the oom reaper could still run for a long
> > time comparatively, if it's cleaning out gigabytes of process memory
> > or what not, hence this dedicated debug infrastructure).
>
> Oh, I'm coming in late, sorry
>
> Anyhow, I was thinking since we agreed this can trigger on some
> CONFIG_DEBUG flag, something like
>
>     /* This is a sleepable region, but use preempt_disable to get debugging
>      * for calls that are not allowed to block for OOM [.. insert
>      * Michal's explanation.. ] */
>     if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) && !mmu_notifier_range_blockable(range))
>         preempt_disable();
>     ops->invalidate_range_start();
I think we also discussed that, and some expressed concerns it would
change behaviour/timing too much for testing, since this really does
disable preemption, not just for the purposes of might_sleep.
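
To make the difference concrete, here is a rough userspace model of the
dedicated-counter approach: only the might_sleep-style check looks at the
counter, and scheduling is left alone. This is a sketch, not the kernel
code; task_model and the model_* helpers are made-up names, and the assert
merely stands in for a kernel warning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-task state (task_struct in the kernel). */
struct task_model {
    unsigned int non_block_count;  /* nesting depth of non-blocking sections */
    unsigned int sleep_violations; /* times the sleep check would have warned */
};

/* Enter a section in which blocking is not allowed. */
static void model_non_block_start(struct task_model *t)
{
    t->non_block_count++;
}

/* Leave the section; an unbalanced end is a bug. */
static void model_non_block_end(struct task_model *t)
{
    assert(t->non_block_count > 0);
    t->non_block_count--;
}

/*
 * Debug-only check: returns true if sleeping is fine here, and records a
 * violation otherwise. Preemption is never touched, so timing outside the
 * check itself stays as it was.
 */
static bool model_might_sleep(struct task_model *t)
{
    if (t->non_block_count) {
        t->sleep_violations++;
        return false;
    }
    return true;
}
```
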
> And I have also been idly mulling doing something like
>
>     if (IS_ENABLED(CONFIG_DEBUG_NOTIFIERS) &&
>         rand &&
>         mmu_notifier_range_blockable(range)) {
>         range->flags = 0
>         if (!ops->invalidate_range_start(range))
>             continue
>
>         // Failed, try again as blockable
>         range->flags = MMU_NOTIFIER_RANGE_BLOCKABLE
>     }
>     ops->invalidate_range_start(range);
>
> Which would give coverage for this corner case without forcing OOM.
Hm, this sounds like a neat idea to slap on top. The rand is going to
be a bit tricky though, but I guess for this we could stuff another
counter into task_struct and just e.g. do this every 1000th or so
invalidate (we'll need to pick a prime so we cycle through notifiers in
case there are multiple). I like.
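
Roughly what I have in mind for the counter, as a userspace sketch rather
than a patch. The struct, the helper name and the interval of 1009 are all
assumptions for illustration; in the kernel the counter would sit in
task_struct.

```c
#include <stdbool.h>

/* Assumed interval: a prime close to "every 1000th or so". */
#define MODEL_TEST_INTERVAL 1009u

/* Hypothetical stand-in for the extra counter in task_struct. */
struct inval_task_model {
    unsigned int invalidate_count;
};

/*
 * Called once per notifier invocation. Returns true on every
 * MODEL_TEST_INTERVAL-th call, meaning "try this one as non-blockable
 * first, then retry as blockable if it fails".
 */
static bool model_should_test_nonblocking(struct inval_task_model *t)
{
    if (++t->invalidate_count < MODEL_TEST_INTERVAL)
        return false;
    t->invalidate_count = 0;
    return true;
}
```

Because the interval is prime, it shares no factor with the number of
registered notifiers (unless that number is a multiple of the interval
itself), so the tested slot walks through all of them instead of hitting
the same notifier every time.
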
Michal, thoughts?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch