Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

From: Andrew Morton
Date: Thu Apr 19 2018 - 19:23:01 EST


On Thu, 19 Apr 2018 17:19:20 -0400 (EDT) Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:

> > > In order to detect these bugs reliably I submit this patch that changes
> > > kvmalloc to always use vmalloc if CONFIG_DEBUG_VM is turned on.
> > >
> > > ...
> > >
> > > --- linux-2.6.orig/mm/util.c 2018-04-18 15:46:23.000000000 +0200
> > > +++ linux-2.6/mm/util.c 2018-04-18 16:00:43.000000000 +0200
> > > @@ -395,6 +395,7 @@ EXPORT_SYMBOL(vm_mmap);
> > > */
> > > void *kvmalloc_node(size_t size, gfp_t flags, int node)
> > > {
> > > +#ifndef CONFIG_DEBUG_VM
> > > gfp_t kmalloc_flags = flags;
> > > void *ret;
> > >
> > > @@ -426,6 +427,7 @@ void *kvmalloc_node(size_t size, gfp_t f
> > > */
> > > if (ret || size <= PAGE_SIZE)
> > > return ret;
> > > +#endif
> > >
> > > return __vmalloc_node_flags_caller(size, node, flags,
> > > __builtin_return_address(0));
> >
> > Well, it doesn't have to be done at compile-time, does it? We could
> > add a knob (in debugfs, presumably) which enables this at runtime.
> > That's far more user-friendly.
>
> But who will turn it on in debugfs?

But who will turn it on in Kconfig? Just a handful of developers. We
could add SONFIG_DEBUG_SG to the list in
Documentation/process/submit-checklist.rst, but nobody reads that.

Also, a whole bunch of defconfigs set CONFIG_DEBUG_SG=y and some
googling indicates that they aren't the only ones...

> It should be default for debugging
> kernels, so that users using them would report the error.

Well. This isn't the first time we've wanted to enable expensive (or
noisy) debugging things in debug kernels, by any means.

So how could we define a debug kernel in which it's OK to enable such
things?

- Could be "it's an -rc kernel". But then we'd be enabling a bunch of
untested code when Linus cuts a release.

- Could be "it's an -rc kernel with SUBLEVEL <= 5". But then we risk
unexpected things happening when Linux cuts -rc6, which still isn't
good.

- How about "it's an -rc kernel with odd-numbered SUBLEVEL and
SUBLEVEL <= 5". That way everybody who runs -rc1, -rc3 and -rc5 will
have kvmalloc debugging enabled. That's potentially nasty because
vmalloc is much slower than kmalloc. But kvmalloc() is only used for
large and probably infrequent allocations, so it's probably OK.

I wonder how we get at SUBLEVEL from within .c.