Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

From: Mikulas Patocka
Date: Tue Apr 24 2018 - 11:30:53 EST

On Tue, 24 Apr 2018, Michal Hocko wrote:

> On Mon 23-04-18 20:25:15, Mikulas Patocka wrote:
> > Fixing __vmalloc code
> > is easy and it doesn't require cooperation with maintainers.
> But it is a hack against the intention of the scope api.

It is not! You can fix __vmalloc now and you can convert the kernel to the
scope API in 4 years. It's not one way or the other.
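
To illustrate why the two approaches don't conflict, here is a rough
userspace sketch, not kernel code. It assumes the problem under discussion
is that an inner allocation made on behalf of __vmalloc ignores the
caller's flags and behaves as GFP_KERNEL. Every model_* and MODEL_* name is
hypothetical; the save/restore pair only mimics the shape of the kernel's
memalloc_noio_save() and memalloc_noio_restore().

```c
/* Userspace model only. Flag values and all model_* names are made up. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOIO   0x0u

/* Per-task "no I/O" marker (a plain global here, for simplicity). */
static unsigned int model_task_noio;

/* Enter a no-I/O scope and return the previous state so scopes can nest. */
static unsigned int model_noio_save(void)
{
    unsigned int old = model_task_noio;

    model_task_noio = 1;
    return old;
}

/* Leave the scope by restoring whatever state the caller saved. */
static void model_noio_restore(unsigned int old)
{
    model_task_noio = old;
}

/* Flags an allocation effectively gets once the scope is taken into account. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
    if (model_task_noio)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/* Stand-in for an inner allocation that ignores the caller's flags. */
static unsigned int model_inner_alloc_flags(void)
{
    return model_effective_gfp(MODEL_GFP_KERNEL);
}

/*
 * The local fix: the wrapper enters the scope itself when the caller
 * asked for no I/O, so the inner allocation is constrained as well.
 * Returns the flags the inner allocation actually saw.
 */
static unsigned int model_vmalloc_flags(unsigned int gfp)
{
    unsigned int old = 0, seen;
    int scoped = !(gfp & MODEL_GFP_IO);

    if (scoped)
        old = model_noio_save();
    seen = model_inner_alloc_flags();
    if (scoped)
        model_noio_restore(old);
    return seen;
}
```

In this model a caller that has already entered a scope gets the same
result as a caller that passes MODEL_GFP_NOIO directly, so a local fix of
this shape would not get in the way of a later conversion to scopes.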

> It also allows maintainers to not care about their broken code.

Most maintainers don't even know that it's broken. Out of 14 subsystems
using __vmalloc with GFP_NOIO/NOFS, only 2 realized that its
implementation is broken and implemented a workaround (me and the XFS
developers).

Misimplementing a function in a subtle and hard-to-notice way won't drive
developers away from using it.

> > > > He refuses 15-line patch to fix GFP_NOIO bug because he believes that in 4
> > > > years, the kernel will be refactored and GFP_NOIO will be eliminated. Why
> > > > does he have veto over this part of the code? I'd much rather argue with
> > > > people who have constructive comments about fixing bugs than with him.
> > >
> > > I didn't NACK the patch AFAIR. I've said it is not a good idea longterm.
> > > I would be much more willing to change my mind if you would back your
> > > patch by a real bug report. Hacks are acceptable when we have a real
> > > issue on our hands. But if we want to fix a potential issue, then we
> > > had better do it properly.
> >
> > Developers should fix bugs in advance, not wait until a crash happens,
> > is analyzed and reported.
> I agree. But are those existing users broken in the first place? I have
> seen so many GFP_NOFS abuses that I would dare to guess that most of
> those vmalloc NOFS abusers can be simply turned into GFP_KERNEL. Maybe
> that is the reason we haven't heard any complaints in years.

alloc_pages reclaims clean pages and most of the hard work is done by
kswapd, so GFP_KERNEL doesn't cause many issues with writeback. But
cheating isn't justified just because you can get away with it. Incorrect
GFP flags cause real problems with shrinkers, because shrinkers are called
from alloc_pages and they do respond to GFP flags.
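
As a rough illustration of what "respond to GFP flags" means, here is a
userspace sketch of a shrinker scan callback that backs off when the
allocation context forbids filesystem recursion. It is not kernel code:
the shr_* and SHR_* names and the flag values are hypothetical, and the
struct only imitates the gfp_mask and nr_to_scan fields of the kernel's
struct shrink_control.

```c
/* Userspace model only. Flag values and all shr_* names are made up. */
#define SHR_GFP_IO 0x1u
#define SHR_GFP_FS 0x2u
#define SHR_STOP   (~0UL)   /* "cannot make progress in this context" */

struct shr_control {
    unsigned int gfp_mask;      /* flags of the allocation that triggered reclaim */
    unsigned long nr_to_scan;   /* how many objects reclaim asks us to free */
};

/* Pretend cache of filesystem objects that this shrinker can free. */
static unsigned long shr_cached_objects = 100;

/*
 * Scan callback: refuse to touch the cache unless the allocating context
 * permits filesystem recursion, otherwise free up to nr_to_scan objects.
 */
static unsigned long shr_scan(const struct shr_control *sc)
{
    unsigned long freed;

    if (!(sc->gfp_mask & SHR_GFP_FS))
        return SHR_STOP;

    freed = sc->nr_to_scan < shr_cached_objects ?
            sc->nr_to_scan : shr_cached_objects;
    shr_cached_objects -= freed;
    return freed;
}
```

If a caller holding a filesystem lock allocates with the FS bit set when
it should not be, nothing stops a callback like this from going ahead and
trying to take the same lock, which is the kind of deadlock I mean.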

I have reported a deadlock due to GFP issues (9d28eb12447). The worst
thing about these bug reports is that they are totally unreproducible and
I get nothing but a stack trace in bugzilla. I had to guess what happened
and I couldn't even test whether the patch fixed the bug.

I'm not really happy that you are deliberately leaving these issues behind
and making excuses.