Re: [PATCH v4 2/2] x86/purgatory: Make sure we fail the build if purgatory.ro has missing symbols

From: Arvind Sankar
Date: Fri Mar 13 2020 - 00:43:33 EST


On Thu, Mar 12, 2020 at 01:50:39PM +0100, Borislav Petkov wrote:
> On Thu, Mar 12, 2020 at 12:58:24PM +0100, Hans de Goede wrote:
> > My version of this patch has already been tested this way. It is
>
> Tested with kexec maybe but if the 0day bot keeps finding breakage, that
> ain't good enough.
>
> > 1. Things are already broken, my patch just exposes the brokenness
> > of some configs, it is not actually breaking things (well it breaks
> > the build, changing a silent brokenness into an obvious one).
>
> As I already explained, that is not good enough.
>
> > 2. I send out the first version of this patch on 7 October 2019, it
> > has not seen any reaction until now. So I'm sending out new versions
> > quickly now that this issue is finally getting some attention...
>
> And that is never the right approach.
>
> Maintainers are busy as hell so !urgent stuff gets to wait. Spamming
> them with more patchsets does not help - fixing stuff properly does.
>
> So, to sum up: if Arvind's approach is the better one, then we should do
> that and s390 should be fixed this way too. And all tested. And we will
> remove the hurry element from it all since it has not been noticed so
> far so it is not urgent and we can take our time and fix it properly.
>
> Ok?
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

If I could try to summarize the situation here:
- the purgatory requires filtering out certain CFLAGS/other settings set
for the generic kernel in order to work correctly
- the patch proposed by Hans de Goede will detect missing filters at
build time rather than when kexec is executed
- the filtering is currently not perfect, as demonstrated by the issues
the 0day bot is finding -- but the patchset will catch these problems at
build time rather than at runtime
- there might be a slight optimization, as I proposed in [1], but it
might have problems, as in [2], even if it seems to work
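
To make the "build time rather than runtime" point concrete, here is a
rough sketch of what such a check boils down to. It is not how the patch
implements it; it only assumes we have `nm`-style symbol output for
purgatory.ro as text. The function names and the sample listing are made
up for illustration.

```python
def undefined_symbols(nm_output):
    """Return the names of undefined ('U') symbols in nm-style output."""
    missing = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined symbols carry no address column, so the line is "U name".
        if len(fields) == 2 and fields[0] == "U":
            missing.append(fields[1])
    return missing


def check_purgatory(nm_output):
    """Fail loudly (as a build step would) if any symbol is unresolved."""
    missing = undefined_symbols(nm_output)
    if missing:
        raise SystemExit(
            "purgatory.ro has undefined symbols: " + ", ".join(missing)
        )


# Hypothetical listing: two symbols pulled in by flags that were not
# filtered out (stack protector, function tracing).
SAMPLE = """\
0000000000000000 T purgatory
                 U __stack_chk_fail
0000000000000040 T sha256_update
                 U mcount
"""
```

With the sample above, check_purgatory(SAMPLE) aborts and names both
symbols, which is the failure we would otherwise only see when kexec
tries to load the purgatory.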

I think the patch as of v5 [0] is useful right now, to catch CFLAGS
additions that aren't currently being filtered correctly. The real
problem is that there exist CFLAGS that should be used for all source
files in the kernel, and there are CFLAGS (e.g. tracing, stack checking,
etc.) that should only be used for the kernel proper. For special
compilations, such as boot stubs, vDSOs and the purgatory, we should have
the generic CFLAGS but not the kernel-proper CFLAGS. The issue currently is
that these special compilations need to filter out all the flags added
for kernel-proper, and this is a moving target as more tracing/sanity
flags get added. Neither simply re-initializing CFLAGS (which will miss
generic CFLAGS) nor trying to filter out CFLAGS (which will miss new
kernel-proper CFLAGS) works very well. I think the better long-term
solution is to split these into independent variables, i.e. BASE_FLAGS
that can be used for everything and KERNEL_FLAGS that are only used for
the kernel proper, rather than conflating both into KBUILD_CFLAGS.
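
A toy model of the three strategies, just to show where each one goes
wrong. This is Python rather than Kbuild, and everything in it is an
assumption for illustration: the flag lists, FILTER_LIST, HANDPICKED and
the function names do not correspond to anything in the tree.

```python
# What a special compilation "knows" at the time its Makefile is written.
FILTER_LIST = {"-pg", "-fstack-protector-strong"}  # flags to strip
HANDPICKED = ["-O2", "-Wall"]                      # flags set from scratch


def flags_by_filtering(base, kernel):
    """Start from the conflated list and strip known kernel-proper flags."""
    return [f for f in base + kernel if f not in FILTER_LIST]


def flags_by_reinit(base, kernel):
    """Ignore the generic list entirely and use a hand-picked set."""
    return list(HANDPICKED)


def flags_by_split(base, kernel):
    """Use only the flags declared valid for every compilation."""
    return list(base)


# Later, someone adds one generic flag and one kernel-proper flag.
base = ["-O2", "-Wall", "-fno-strict-aliasing"]
kernel = ["-pg", "-fstack-protector-strong", "-fsanitize=kernel-address"]

leaked = [f for f in flags_by_filtering(base, kernel) if f in kernel]
dropped = [f for f in base if f not in flags_by_reinit(base, kernel)]
```

Here `leaked` ends up holding the new sanitizer flag and `dropped` the
new generic flag, while flags_by_split() returns exactly `base` without
the special compilation having to be touched.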

But to move forward incrementally, patch v5 is probably the cleanest. My
suggestion in [1], I now think, changes things significantly for kexec
by turning the purgatory from a relocatable object file into an actual
executable, and might have knock-on implications that need to be
reviewed and tested carefully before it can be merged, as shown by [2].

[0] https://lore.kernel.org/lkml/20200312114951.56009-1-hdegoede@xxxxxxxxxx/
[1] https://lore.kernel.org/lkml/20200312001006.GA170175@xxxxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20200312182322.GA506594@xxxxxxxxxxxxxxxxxx/