Re: [PATCH 01/12] extarray: define helpers for arrays defined in linker scripts
From: Richard Biener
Date: Wed Oct 19 2016 - 11:04:00 EST
On Tue, 18 Oct 2016, Luis R. Rodriguez wrote:
> On Tue, Oct 18, 2016 at 10:08:44AM +0200, Vegard Nossum wrote:
> >
>
> Vegard, thanks for bringing this to attention!
>
> Including this hunk for those that were originally not CC'd
> on the original patch.
>
> > diff --git a/include/linux/extarray.h b/include/linux/extarray.h
> > new file mode 100644
> > index 0000000..1185abc
> > --- /dev/null
> > +++ b/include/linux/extarray.h
> > @@ -0,0 +1,65 @@
> > +#ifndef LINUX_EXTARRAY_H
> > +#define LINUX_EXTARRAY_H
> > +
> > +#include <linux/compiler.h>
> > +
> > +/*
> > + * A common pattern in the kernel
>
> This is quite an understatement, as you noted on the cover letter, I've been
> reviewing these uses, in particular where such uses are used also for the
> linker script for quite some time now. I've devised a generic API for these
> uses then to help with making it easier to add new entries, making easier
> to avoid typos, and giving us some new features from this effort. This the
> linker table and section range APIs [0] [1]. Given my review of all this code in
> the kernel I'd say this use is not just common, its a well established practice
> by now, across *all* architectures.
>
> In fact I would not be surprised if this usage and practice has extended far
> beyond the kernel by now, and custom firmwares / linker scripts also use this
> to mimic the practice on the kernel to take advantage of this old hack to stuff
> special code / data structures into special ELF sections. As such, I would not
> think this is just an issue for Linux.
>
> Upon a quick cursory review I can confirm such use is prevalent in Xen as well,
> for instance the Xen Security Module framework uses it since eons ago [2]. Its
> used elsewhere on the Xen linker script too though... so XSM is one small
> example at a *quick* glance.
>
> [0] https://lkml.kernel.org/r/1471642875-5957-1-git-send-email-mcgrof@xxxxxxxxxx
> [1] https://lkml.kernel.org/r/1471642454-5679-1-git-send-email-mcgrof@xxxxxxxxxx
> [2] https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=d046f361dc937d8fc179cc2da168f571726cb5a0
>
> > + is to put certain objects in a specific
> > + * named section and then create variables in the linker script pointing
> > + * to the start and the end of this section. These variables are declared
> > + * as extern arrays to allow C code to iterate over the list of objects
>
> Right, sometimes its just a char pointer to represent code, other times it
> custom data structure.
>
> > + *
> > + * In C, comparing pointers to objects in two different arrays is undefined
> > + * GCC version 7.0 and newer (commit 73447cc5d17) will aggressively optimize
> > + * out such comparisons if it can prove that the two pointers point to
> > + * different arrays (which is the case when the arrays are declared as two
> > + * separate variables). This breaks the typical code used to iterate over
> > + * such arrays.
>
> Eek, thanks, I checked commit 73447cc5d17 on gcc [2] however its commit log
> is pretty vague to me and does not seem to justify why exactly this optimization
> is done, nor what effort was put in to vet for it, as such I cannot agree or
> disagree with it. Logically though, to me these are just pointers, as such,
> its not clear to me why gcc can take such optimization to make such assumptions.
>
> Since I cannot infer any more from this commit, and given how prevalent I
> expect this use to be (clearly even outside of Linux) I am inclined to consider
> this a gcc bug, which requires at least an opt-in optimization rather than this
> being a default. Had anyone reported this ??
>
> Richard ?
The commit implements a long-standing failure to optimize trivial pointer
comparisons that arise for example from libstdc++. PR65686 contains
a simple C example:
mytype f(struct S *e)
{
mytype x;
if(&x != e->pu)
__builtin_memcpy(&x, e->pu, sizeof(unsigned));
return x;
}
where GCC before the commit could not optimize the &x != e->pu test
as trivial false.
The commit uses points-to analysis results to simplify pointer equality
tests (which is in itself a very good way to expose bugs in points-to
analysis -- I've fixed a bunch of them thanks to this ;)).
> [2] https://github.com/gcc-mirror/gcc/commit/73447cc5d17
>
> > + *
> > + * One way to get around this limitation is to force GCC to lose any array
> > + * information about the pointers before we compare them. We can use e.g.
> > + * OPTIMIZER_HIDE_VAR() for this.
>
> As far as Linux is concerned though your patch set addresses covering a few
> cases, it does not cover all, so while it might help boot x86 on some type of
> system, by no means would I expect it suffice to boot all Linux systems.
> Additionally, while it may *fix* boot on x86, are we certain the use of
> OPTIMIZER_HIDE_VAR() may not *break* on certain platforms ? I would ask such
> type of intrusive changes which affect the linker script to go well beyond
> just 0-day for testing -- Guenter Roeck was kind enough to let me test my
> series for linker table / section ranges on his infrastructure which covers
> much architectures and toolchains not addressed by 0-day. I'd expect such type
> of testing for these types of changes, which can affect many architectures.
>
> Additionally you asked for this to series to be considered a stable
> patch, if this just fixes x86 boot, but breaks other architecture it would
> be a considerable regression. I'd prefer we first determine if we want this
> change reverted on gcc, or if it at least can be disabled by default.
> I really do expect shit to hit the fan elsewhere, so this work around
> doesn't same like the next best step at this point.
>
> > + *
> > + * This file defines a few helpers to deal with declaring and accessing
> > + * such linker-script-defined arrays.
> > + */
> > +
> > +
> > +#define DECLARE_EXTARRAY(type, name) \
> > + extern type __start_##name[]; \
> > + extern type __stop_##name[]; \
> > +
> > +#define _ext_start(name, tmp) \
> > + ({ \
> > + typeof(*__start_##name) *tmp = __start_##name; \
> > + OPTIMIZER_HIDE_VAR(tmp); \
> > + tmp; \
> > + })
> > +
> > +#define ext_start(name) _ext_start(name, __UNIQUE_ID(ext_start_))
> > +
> > +#define _ext_end(name, tmp) \
> > + ({ \
> > + typeof(*__stop_##name) *tmp = __stop_##name; \
> > + OPTIMIZER_HIDE_VAR(tmp); \
> > + tmp; \
> > + })
> > +
> > +#define ext_end(name) _ext_end(name, __UNIQUE_ID(ext_end_))
> > +
> > +#define _ext_size(name, tmp1, tmp2) \
> > + ({ \
> > + typeof(*__start_##name) *tmp1 = __start_##name; \
> > + typeof(*__stop_##name) *tmp2 = __stop_##name; \
> > + OPTIMIZER_HIDE_VAR(tmp1); \
> > + OPTIMIZER_HIDE_VAR(tmp2); \
> > + tmp2 - tmp1; \
> > + })
> > +
> > +#define ext_size(name) \
> > + _ext_size(name, __UNIQUE_ID(ext_size1_), __UNIQUE_ID(ext_size2_))
> > +
> > +#define ext_for_each(var, name) \
> > + for (var = ext_start(name); var != ext_end(name); var++)
> > +
> > +#endif
>
> FWIW, with linker able we'd do this type of "fix" in one place later,
> if we wanted it, provided all uses are ported, of course.
>
> > On 10/17/2016 01:45 PM, Peter Zijlstra wrote:
> > > On Mon, Oct 17, 2016 at 01:27:08PM +0200, Vegard Nossum wrote:
> > > > On 10/17/2016 11:09 AM, Peter Zijlstra wrote:
> > > > > On Mon, Oct 17, 2016 at 11:01:13AM +0200, Jiri Slaby wrote:
> > > > > > On the top of that, it's incorrect C according to the standard.
> > > > >
> > > > > According to the standard non of the kernel has any chance in hell of
> > > > > working, so don't pretend you care about that :-)
> > > >
> > > > I think that's a bit of a false dilemma. It's obviously true that kernel
> > > > code does not conform to the standards, but that doesn't mean it's not
> > > > something we should strive towards or care about in general. It helps
> > > > static analysis tools, compiler diversity, etc.
> > >
> > > Sure, but this, two separately allocated objects their address should
> > > not be compared and therefore... stuff is explicitly relied upon by the
> > > kernel in many places.
> > >
> > > We have workarounds in various places, and this patch adds yet another
> > > instance of it.
> > >
> > > The workaround is simply confusing the compiler enough to have it not do
> > > the 'optimization'. But we very much still rely on this 'undefined'
> > > behaviour.
> > >
> > > I think it makes more sense to explicitly allow it than to obfuscate our
> > > code and run the risk a future compiler will see through our tricks.
> >
> > Actually, I think we're all a bit wrong.
> >
> > It's not comparing the pointers that's undefined behavior, that was my
> > bad trying to oversimplify the issue.
> >
> > Of course, comparing arbitrary (valid) pointers with each other is not
> > undefined behavior. The undefined behavior is technically doing iter++
> > past the end of the array,
>
> What defines the end of the array?
>
> > that is "creating" a pointer that points outside an array.
>
> I mean, its just a pointer.
>
> This is the sort of information that would be useful for the commit log.
>
> > What gcc does wrong is that it sees us iterating over one array and the
> > optimizer concludes that the iterator can never point to the second
> > array.
>
> Which second array? Why does it make this huge assumption ?
>
> > I'd argue there's no real undefined behaviour happening here.
> > Thus, the code _is_ correct and valid C as it stands, it just doesn't do
> > what you would expect intuitively.
>
> People have relied on its functionality for years, if this is going
> to change it would be I think a good idea to at least have a strong
> justification rather than having us trying to interpret the original
> goal of the gcc patch.
The original goal of the gcc patch wasn't to break the kernel. The
goal was to implement an optimization other compilers do since a long
time.
> > However, from the linker script's point of view there is no difference
> > between one big array and two consecutive arrays of the same type, and
> > the fact that the compiler doesn't know the memory layout -- although
> > we do.
>
> In Linux we don't mix/match pointer types, we typically use two char *
> pointers for start/end of code, or a data structure pointers for start/
> end of list, that's it. Its not clear to me why gcc believes it is correct
> to assume start != end.
>
> > When we call OPTIMIZER_HIDE_VAR() we're not really "confusing" the
> > compiler, it's more like calling a function defined in a different file
> > (therefore the compiler can't see into it) that returns a pointer which
> > we _know_ is valid because it still points to (the end of) the array.
>
> Its a hack to work around the optimization, if we are to do this I think
> its important we all first understand *why* the original commit was done,
> without this - it would seem the current optimization will just go breaking
> quite a bit of projects. Your changeset would also require much more work
> (or we port things to the linker table / section ranges API, and just do the
> fix with those wrappers, as it would mean 1 collateral evolution rather than 2
> -- one for this fix and then one for the linker table API).
>
> > If we obtain a pointer somehow (be it from a hardware register or from
> > the stack of a userspace process), the compiler must assume that it's a
> > valid pointer, right? The only reason it didn't do that for the arrays
> > was because we had declared the two arrays as separate variables so it
> > "knew" it was the case that the iterator initialised to point to one
> > array could never point to the second array.
>
> But why is this invalid C all of a sudden with optimizations ?
>
> foo.h:
>
> extern char *start_foo;
> extern char *end_foo;
>
> foo.c:
>
> #include "foo.h"
>
> char *str = "hello";
>
> char *start_foo;
> char *end_foo;
>
> int main(void)
> {
> start_foo = str;
> end_foo = str;
>
> return !(start_foo == end_foo);
> }
Why should this be invalid?
Let's look at what is special about the kernel usage. Looking
at GCC bug 77964 it is declaring
extern struct builtin_fw __start_builtin_fw[];
extern struct builtin_fw __end_builtin_fw[];
which are extern zero-sized arrays. I suppose they are nowhere
actually defined but these symbols are created by the linker script only.
I can think of adding workarounds to GCC to try catching this
special case which would be treating a pointer to a extern
object of size zero (so you can't do this in C++ ...) as a
pointer to possibly any other global variable (given actual
data layout may make the pointer value equal to it).
Index: gcc/tree-ssa-structalias.c
===================================================================
--- gcc/tree-ssa-structalias.c (revision 241327)
+++ gcc/tree-ssa-structalias.c (working copy)
@@ -2944,6 +2944,17 @@ get_constraint_for_ssa_var (tree t, vec<
DECL_PT_UID (t) = DECL_UID (node->decl);
t = node->decl;
}
+
+ /* If this is an external zero-sized object consider it to
+ point to NONLOCAL as well. */
+ if (DECL_EXTERNAL (t)
+ && (! DECL_SIZE (t) || integer_zerop (DECL_SIZE (t))))
+ {
+ cexpr.var = nonlocal_id;
+ cexpr.type = SCALAR;
+ cexpr.offset = 0;
+ results->safe_push (cexpr);
+ }
}
vi = get_vi_for_tree (t);
Note that any issues exposed by the pointer equality simplification
are possible issues in previous GCC with regard to alias analysis.
I notice we try to honor symbol interposition when directly comparing
__start_builtin_fw != __end_builtin_fw for example. But we certainly
do not "honor" interposition for alias analysis for, say
extern int a[1];
extern int b[1];
where a store to a[0] is not considered aliasing b[0].
Richard.
> > Anyway, IANALL.
>
> IGINYA - I guess I'm not young anymore, what's IANALL mean? :)
>
> Luis
>
>
--
Richard Biener <rguenther@xxxxxxx>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)