Re: [BUG] Build error for 4.15-rc3 kernel caused by patch "kbuild: Add a cache for generated variables"

From: Masahiro Yamada
Date: Wed Dec 20 2017 - 23:02:28 EST

Hi Doug

2017-12-21 2:07 GMT+09:00 Doug Anderson <dianders@xxxxxxxxxxxx>:
> Hi,
> On Tue, Dec 19, 2017 at 6:29 PM, Masahiro Yamada
> <yamada.masahiro@xxxxxxxxxxxxx> wrote:
>> 2017-12-19 2:17 GMT+09:00 Doug Anderson <dianders@xxxxxxxxxxxx>:
>>> Hi,
>>> On Mon, Dec 18, 2017 at 7:50 AM, Masahiro Yamada
>>> <yamada.masahiro@xxxxxxxxxxxxx> wrote:
>>>> 2017-12-18 23:56 GMT+09:00 Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>:
>>>>> 2017-12-17 7:35 GMT+09:00 Yang Shi <yang.s@xxxxxxxxxxxxxxx>:
>>>>>> Hi folks,
>>>>>> I just upgraded gcc to 6.4 on my CentOS 7 machine at Arnd's suggestion. But
>>>>>> I ran into the compile error below with the 4.15-rc3 kernel:
>>>>>> In file included from ./include/uapi/linux/uuid.h:21:0,
>>>>>> from ./include/linux/uuid.h:19,
>>>>>> from ./include/linux/mod_devicetable.h:12,
>>>>>> from scripts/mod/devicetable-offsets.c:2:
>>>>>> ./include/linux/string.h:8:20: fatal error: stdarg.h: No such file or
>>>>>> directory
>>>>>> #include <stdarg.h>
>>>>>> I bisected to commit 3298b690b21cdbe6b2ae8076d9147027f396f2b1 ("kbuild: Add
>>>>>> a cache for generated variables"). Once I revert this commit, the kernel
>>>>>> builds fine.
>>>>>> gcc 4.8.5 builds the kernel fine even with this commit.
>>>>>> I'm not quite sure whether this is a bug or whether my gcc install is
>>>>>> skewed. The install can build the kernel without that commit, but that
>>>>>> commit might only exacerbate an existing problem.
>>>>>> Any hint is appreciated.
>>>>> Today, I was also hit by the same error
>>>>> when I was compiling linux-next.
>>>>> I am not sure why this error happens, but
>>>>> "make clean" will probably fix the problem.
>>>>> You need to do "make clean" to blow away the cache
>>>>> when you upgrade your compiler.
>>>>> This is nasty, though...
>>>> I got it.
>>>> The cause is the following line in the top-level Makefile:
>>>> NOSTDINC_FLAGS += -nostdinc -isystem $(call shell-cached,$(CC) -print-file-name=include)
>>>> If a stale result of -print-file-name is stored in the cache file,
>>>> the compiler fails to find <stdarg.h>.
>>> Nice catch! Do you have any idea how we can fix it? I suppose we
>>> could add a single (non-cached) call to CC somewhere in there to get
>>> CC's version and clobber the cache if the version changes. Is that
>>> the best approach here?
>>> In general I remember thinking about the gcc upgrade problem when I
>>> was first experimenting with the cache. At the time my assumption was
>>> that if someone updated their gcc then they really ought to be doing a
>>> clean anyway (I wasn't sure if the build system somehow enforced this,
>>> but I didn't think so). Doing an incremental build after a compiler
>>> upgrade just seems (to me) to be asking for trouble, or at
>>> the very least seems like it's not what the user wanted (if you update
>>> your compiler you almost certainly want it to be used to build all of
>>> your code, don't you?)
>> I agree.
>> When you upgrade your compiler,
>> you need to remove not only cache files, but also all object files.
>> So, "make clean" is the most reasonable way.
>>> Even if it's wise to do a clean after a compiler upgrade, it still
>>> seems pretty non-ideal that a user has to decipher an arcane error
>>> like this, so it seems like we should see what we can do to detect
>>> this case for the user and help them out. Perhaps rather than
>>> clobbering the cache we should actually suggest that the user run a
>>> "make clean"?
>> Right. I think it's a good thing to do.
> Are you planning on doing this, or is this something you'd like me to
> attempt? I'm a bit busy in the last two days before I go on Christmas
> break, but I can try to squeeze something like this in since the root
> of the issue is a patch that I authored. Let me know.

I am busy these days, too.
Your contribution would be much appreciated.

> If this is something you'd like me to do, let me know if you think the
> right solution is to detect the problem and warn the user or if the
> right solution is to just blow away the cache. It would be up to you,
> but I'd tend to go the route of warning the user because:
> * The user should almost certainly do a "make clean" to really ensure
> no mismatch between object files.
> * I could imagine that trying to invoke "make clean" automatically
> might be complicated.

I agree with both.

When a compiler upgrade is detected,
we can terminate the build
with a hint message prompting users to run "make clean".
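
As a rough illustration of such a check (a sketch only, not a tested
kbuild patch): the function name check_cc_stamp and the stamp file are
hypothetical, and it assumes the first line of "$CC --version" is enough
to tell one compiler from another. It makes one uncached compiler call
per build and compares the result with what the previous build recorded.

```shell
#!/bin/sh
# Hypothetical sketch: detect a compiler change between builds.
# check_cc_stamp and the stamp file are invented names, not kbuild code.
#
# check_cc_stamp CC STAMP
#   returns 0 if STAMP is absent (the version is then recorded) or unchanged
#   returns 1, with a hint on stderr, if the compiler version line differs
check_cc_stamp() {
    cc=$1
    stamp=$2

    # One uncached compiler invocation. $cc is deliberately unquoted so
    # that a multi-word value such as "ccache gcc" still works.
    now=$($cc --version 2>/dev/null | head -n 1)

    if [ ! -f "$stamp" ]; then
        # First build in this tree: record the version and carry on.
        printf '%s\n' "$now" > "$stamp"
        return 0
    fi

    old=$(cat "$stamp")
    if [ "$old" != "$now" ]; then
        echo "*** Compiler changed since the last build:" >&2
        echo "***   was: $old" >&2
        echo "***   now: $now" >&2
        echo "*** Please run \"make clean\" and build again." >&2
        return 1
    fi
    return 0
}

# Possible use early in a build (assumption: "make clean" removes the stamp):
#   check_cc_stamp "${CC:-gcc}" .cc-version-stamp || exit 1
```

For the hint to be actionable, "make clean" would have to delete the
stamp file together with the cache; otherwise the check keeps failing
after the user has done what it asked.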

>> BTW, could "sudo make install" or "sudo make modules_install"
>> add some cache entries with superuser privileges?
>> (For example, if you run build targets with CROSS_COMPILE
>> but run install targets without CROSS_COMPILE,
>> the install targets will produce different cache entries.)
>> If so, "make clean" run as a normal user
>> cannot remove the cache files...
> Hrm. That doesn't sound nice. I guess this could be solved by
> something like your "no-compiler-targets" patch, but IIUC that didn't
> include "install" or "modules_install". I guess the other option would
> be to somehow detect "UID=0" specifically and not generate the cache?
> -Doug

That would be a solution.
We can skip cache generation for certain targets.
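
A minimal sketch of that decision, combining the UID check with a target
list. The function name should_write_cache and the exact set of targets
are assumptions made for illustration; they are not taken from the
"no-compiler-targets" patch.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether this make run may write cache entries.
# should_write_cache and the target list below are illustrative only.
#
# should_write_cache UID [GOAL...]
#   returns 0 if cache entries may be written, 1 if the cache should be skipped
should_write_cache() {
    uid=$1
    shift

    # Never let root create cache files that a normal user's
    # "make clean" could not remove afterwards.
    if [ "$uid" -eq 0 ]; then
        return 1
    fi

    # Skip the cache when any requested goal is an install or cleanup
    # target, since those do not need the compiler probes.
    for goal in "$@"; do
        case $goal in
            install|modules_install|clean|mrproper|distclean|help)
                return 1
                ;;
        esac
    done
    return 0
}

# Possible use from a wrapper script:
#   should_write_cache "$(id -u)" "$@" || echo "not writing the cache"
```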

Best Regards
Masahiro Yamada