Re: [RFC][PATCH] Add __GFP_ZERO to alloc_cpumask_var_node() if ptr is zero

From: Ingo Molnar
Date: Sun Dec 06 2015 - 12:30:35 EST



* Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:

> Ingo Molnar <mingo@xxxxxxxxxx> writes:
> > * Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> >> On Fri, 04 Dec 2015 12:05:12 +1030
> >> Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote:
> >>
> >> > This is clever, but I would advise against such subtle code. We will never be
> >> > able to remove this code once it is in.
> >> >
> >> > Would suggest making the non-CPUMASK_OFFSTACK stubs write garbage into the
> >> > cpumasks instead, iff !(flags & __GFP_ZERO).
> >>
> >> I actually thought of the same thing, but thought it was a bit harsh. If others
> >> think that's a better solution, then I'll submit a patch to do that.
> >
> > That just makes things more fragile - 'garbage' will spread the breakage, and if
> > the breakage is subtle, it will spread subtle breakage.
> >
> > So why not use a kzalloc_node() [equivalent] call instead of kmalloc_node(), to
> > make sure it's all zeroed instead of uninitialized?
>
> OTOH, why not make *every* kmalloc a kzalloc?

The big difference from alloc_cpumask_var_node() is that kmalloc() is
well-defined in the sense that it always returns uninitialized (sometimes even
poisoned) buffers.

But alloc_cpumask_var_node() will return a zeroed cpumask 99.9% of the time when
the kernel being run is using on-stack cpumasks. So it's very easy to forget to
initialize a mask and not discover the bug for extended periods of time.

As it happened here, and as was fixed with the patch. Hence my suggestion.
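
To illustrate how the common config hides the bug, here is a minimal user-space
sketch, not kernel code. All sketch_* names are hypothetical. It assumes the
embedded ("on-stack" config) mask happens to live in zero-initialized storage,
and it mimics an uninitialized (poisoned) heap buffer with an explicit 0xA5
fill so that the behaviour is deterministic:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MASK_BYTES 16	/* hypothetical mask size */

/* Embedded variant: the "allocator" is a stub and never touches the mask. */
bool sketch_alloc_embedded(unsigned char *bits)
{
	(void)bits;
	return true;
}

/* Heap variant: the buffer is filled with 0xA5 to stand in for garbage. */
bool sketch_alloc_heap(unsigned char **bits)
{
	*bits = malloc(SKETCH_MASK_BYTES);
	if (!*bits)
		return false;
	memset(*bits, 0xA5, SKETCH_MASK_BYTES);
	return true;
}

/* Returns true if no bit is set in the mask. */
bool sketch_mask_empty(const unsigned char *bits)
{
	for (size_t i = 0; i < SKETCH_MASK_BYTES; i++)
		if (bits[i])
			return false;
	return true;
}

/* Assumption: static storage, hence zeroed before first use. */
static unsigned char sketch_static_mask[SKETCH_MASK_BYTES];

/* Buggy caller, embedded config: forgets to clear, still sees an empty mask. */
bool sketch_buggy_embedded(void)
{
	if (!sketch_alloc_embedded(sketch_static_mask))
		return false;
	return sketch_mask_empty(sketch_static_mask);
}

/* Same buggy caller, heap config: the missing clear is now visible. */
bool sketch_buggy_heap(void)
{
	unsigned char *bits;
	bool empty;

	if (!sketch_alloc_heap(&bits))
		return false;
	empty = sketch_mask_empty(bits);
	free(bits);
	return empty;
}
```

The same caller "works" in the embedded variant and breaks only in the heap
variant, which is the pattern described above.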

> The issue here is not that the issue is subtle (not using a zeroing allocator is
> a pretty clear bug), it's that it's papered over by the normal config.

Exactly.

> If we had a config option already to garbage-fill allocations, it'd be a simple
> solution.
>
> I don't think there are great answers here. But adding more subtle zeroing
> semantics feels wrong, even if it will mostly Just Work.

It's not subtle if the naming clearly reflects it (hence my suggestion to rename
the API) - and the status quo for on-stack allocations is zeroing anyway, so it's
not a big jump...
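
As a rough user-space sketch of what "the naming clearly reflects it" could look
like (the sketch_zalloc_* names are hypothetical, not a proposed kernel API):
both variants guarantee a zeroed mask, so a caller sees the same result no
matter which config is built.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_ZMASK_BYTES 16	/* hypothetical mask size */

/* Embedded variant: nothing to allocate, but zero the storage explicitly. */
bool sketch_zalloc_embedded(unsigned char *bits)
{
	memset(bits, 0, SKETCH_ZMASK_BYTES);
	return true;
}

/* Heap variant: calloc() returns zero-filled memory. */
bool sketch_zalloc_heap(unsigned char **bits)
{
	*bits = calloc(1, SKETCH_ZMASK_BYTES);
	return *bits != NULL;
}

/* Returns true if every byte of the mask is zero. */
bool sketch_zmask_all_zero(const unsigned char *bits)
{
	for (size_t i = 0; i < SKETCH_ZMASK_BYTES; i++)
		if (bits[i])
			return false;
	return true;
}
```

With the "z" in the name, the zeroing is part of the documented contract rather
than an accident of the configuration.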

Thanks,

Ingo