Re: [PATCH v3] mm: fix tick timer stall during deferred page init

From: Michal Hocko
Date: Wed Apr 01 2020 - 12:47:19 EST


On Wed 01-04-20 12:41:13, Pavel Tatashin wrote:
> On Wed, Apr 1, 2020 at 12:26 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Wed 01-04-20 12:18:10, Daniel Jordan wrote:
> > > On Wed, Apr 01, 2020 at 06:12:43PM +0200, Michal Hocko wrote:
> > > > On Wed 01-04-20 12:09:29, Daniel Jordan wrote:
> > > > > On Wed, Apr 01, 2020 at 06:00:48PM +0200, Michal Hocko wrote:
> > > > > > On Wed 01-04-20 17:50:22, David Hildenbrand wrote:
> > > > > > > On 01.04.20 17:42, Michal Hocko wrote:
> > > > > > > > This needs a double checking but I strongly believe that the lock can be
> > > > > > > > simply dropped in this path.
> > > > >
> > > > > This is what my fix does, it limits the time the resize lock is held.
> > > >
> > > > Just remove it from the deferred initialization and add a comment that we
> > > > are deliberately not taking the lock here because abc
> > >
> > > I think it has to be a little more involved because of the window where
> > > interrupts might allocate during deferred init, as Vlastimil pointed out a few
> > > years ago when the change was made.
> >
> > I do not remember any details but do we have any actual real allocation
> > failure or was this mostly a theoretical concern? Vlastimil? For your
> > context we are talking about 3a2d7fa8a3d5 ("mm: disable interrupts while
> > initializing deferred pages")
>
> I do not remember seeing any real failures, this was a theoretical
> window. So, we could potentially simply remove these locks until we
> see a real boot failure in some interrupt thread. The allocation has
> to be rather large as well.

Yes please! We are really great at overcomplicating and over-engineering
stuff based on theoretical issues, then building on top of that and
making the code even more complex because nobody dares to re-evaluate,
and so on.

--
Michal Hocko
SUSE Labs