Re: [linus:master] BUILD REGRESSION 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e

From: Fengguang Wu
Date: Sun Sep 17 2017 - 20:48:54 EST


Hi Linus,

On Sun, Sep 17, 2017 at 08:31:56AM -0700, Linus Torvalds wrote:
> Fengguang,
> it looks like the kernel build robot _only_ tests the actual rc
> kernels, and doesn't bisect down where the error started.

Nah, that's an illusion. :)

It's a per-branch summary report _in addition to_ per-bisect reports.
The former shows all active (not-yet-fixed) errors/warnings in the
current branch HEAD; the latter shows the result of one bisect.

Typically, for every error message shown in this summary report,
an individual bisect report has already been sent to the relevant
authors and committers. I'll give concrete examples at the bottom.

> Any chance that when it notices an error, it would bisect it, like it
> does for linux-next?

It should already be so -- otherwise it's a bug in the 0day robot. In
fact your tree _implicitly_ receives many more tests than linux-next,
and linux-next receives more tests than other individual developer trees.
It works like this:

The robot will normally test all pushed branch HEADs of all git trees.
IOW, each of your (and others') git pushes will trigger tests --
except when the robot occasionally cannot keep up.

The RC kernels will effectively receive _many more_ tests, since
developers typically base their git branches on RC releases. So
whenever they do a git push, the tests triggered on their branch HEAD
will automatically cover its base RC kernel.

Whenever an error is found in a commit (typically the branch HEAD),
the robot will traverse backwards in its git history and test these
critical points until a GOOD point is found for starting the bisect:

- the branch's BASE commit (typically an RC kernel)
- the official releases (eg. 4.14 => 4.13 => 4.12 => ...)

We'll give up when the bug is found to exist in a kernel that is too
old, since old bugs are likely either uninteresting (no one cares to
fix them) or hard to bisect.
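
The search for a bisect starting point described above could be
sketched roughly as follows. This is only an illustration, not the
robot's actual code: the function name, the is_good callback and the
max_releases cutoff (standing in for "too old") are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def find_good_start(
    bad_commit: str,
    base_commit: str,
    releases: Sequence[str],
    is_good: Callable[[str], bool],
    max_releases: int = 3,  # assumed cutoff for "too old", not a real robot setting
) -> Optional[str]:
    """Return a known-GOOD commit to start the bisect from, or None to give up.

    Candidates are tried newest first: the branch's BASE commit, then
    the official releases in the order given (e.g. 4.13, 4.12, ...).
    """
    candidates = [base_commit, *releases[:max_releases]]
    for commit in candidates:
        if commit == bad_commit:
            continue  # already known to be BAD
        if is_good(commit):
            return commit  # bisect between this commit and bad_commit
    return None  # bug is older than every tested point: give up


# Pretend the bug was introduced somewhere after v4.12.
good_points = {"v4.12", "v4.11"}
start = find_good_start(
    "HEAD", "v4.14-rc1", ["v4.13", "v4.12", "v4.11"],
    lambda commit: commit in good_points,
)
```

With a GOOD point in hand, the range from that commit to the failing
HEAD is what a regular git bisect would then narrow down.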

> On Sat, Sep 16, 2017 at 11:02 PM, kbuild test robot
> <fengguang.wu@xxxxxxxxx> wrote:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> > 2bd6bf03f4c1c59381d62c61d03f6cc3fe71f66e Linux 4.14-rc1
> >
> > arch/alpha/include/asm/mmu_context.h:160:24: error: invalid type argument of '->' (have 'int')
> >
> > Error ids grouped by kconfigs:
> >
> > recent_errors
> > ├── alpha-allmodconfig
> > │   ├── arch-alpha-include-asm-mmu_context.h:error:implicit-declaration-of-function-task_thread_info
> > │   └── arch-alpha-include-asm-mmu_context.h:error:invalid-type-argument-of-(have-int-)

The bisect report was sent here:

https://lkml.org/lkml/2017/9/16/187

And a fix was freshly posted here:

https://patchwork.kernel.org/patch/9954963/

> > ├── cris-allyesconfig
> > │   ├── drivers-tty-serial-8250_core.c:error:unrecognizable-insn:
> > │   └── drivers-tty-serial-8250_core.c:internal-compiler-error:in-extract_insn-at-recog.c

Bisected and reported here:

https://www.spinics.net/lists/linux-serial/msg27175.html

> > ├── ia64-allmodconfig
> > │   ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
> > │   └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device

Reported here:

https://www.spinics.net/lists/kernel/msg2556450.html

which may be fixed by this RFC patch:

https://patchwork.kernel.org/patch/9939191/

> > ├── ia64-allyesconfig
> > │   ├── drivers-clocksource-timer-of.h:error:field-clkevt-has-incomplete-type
> > │   └── include-linux-kernel.h:error:dereferencing-pointer-to-incomplete-type-struct-clock_event_device

Ditto.

> > ├── mips-jmr3927_defconfig
> > │   ├── arch-mips-vdso-elf.S:error:march-r3900-requires-mfp32
> > │   ├── arch-mips-vdso-gettimeofday.c:error:march-r3900-requires-mfp32
> > │   ├── arch-mips-vdso-sigreturn.S:error:march-r3900-requires-mfp32
> > │   └── cc1:error:march-r3900-requires-mfp32

That's a rather old bug that I gave up reporting repeatedly:

https://www.linux-mips.org/archives/linux-mips/2016-03/msg00215.html

> > ├── parisc-allmodconfig
> > │   └── ERROR:__cmpxchg_u64-drivers-net-ethernet-intel-i40e-i40e.ko-undefined

Reported here:

https://lkml.org/lkml/2017/9/10/100

> > ├── sparc64-allmodconfig
> > │   ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-per_cpu
> > │   ├── arch-sparc-include-asm-mmu_context_64.h:error:implicit-declaration-of-function-smp_processor_id
> > │   ├── arch-sparc-include-asm-mmu_context_64.h:error:per_cpu_secondary_mm-undeclared-(first-use-in-this-function)
> > │   └── arch-sparc-include-asm-mmu_context_64.h:error:unknown-type-name-per_cpu_secondary_mm

Reported here:

https://lists.01.org/pipermail/kbuild-all/2017-August/037613.html
https://lists.01.org/pipermail/kbuild-all/2017-September/037968.html

And recently fixed here:

https://patchwork.kernel.org/patch/9946375/

> > └── x86_64-randconfig-s4-09170918
> >     └── net-netfilter-nf_nat_core.c:note:in-expansion-of-macro-if

Reported here:

https://lkml.org/lkml/2017/9/16/203

As you may see, all the errors mentioned in this summary report have
been individually bisected and reported somewhere before.
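
For readers wondering how a compiler message relates to the error ids
in the tree above, here is a rough approximation. The rule (drop the
line/column numbers, then collapse runs of other characters into '-')
is only inferred from the ids in this report; it is not the robot's
real implementation, and the helper names are made up.

```python
import re
from collections import defaultdict


def slug(text: str) -> str:
    """Collapse runs of characters outside [A-Za-z0-9_.()] into a single '-'."""
    return re.sub(r"[^A-Za-z0-9_.()]+", "-", text).strip("-")


def error_id(line: str) -> str:
    """Turn 'path:line:col: kind: message' into a location-independent id."""
    m = re.match(r"([^:\s]+):\d+(?::\d+)?:\s*(\w+):\s*(.*)", line)
    if m is None:
        return slug(line)  # fallback for lines in some other format
    path, kind, message = m.groups()
    return f"{slug(path)}:{kind}:{slug(message)}"


def group_by_kconfig(results):
    """Group (kconfig, compiler message) pairs into kconfig -> sorted ids."""
    grouped = defaultdict(set)
    for kconfig, line in results:
        grouped[kconfig].add(error_id(line))
    return {kconfig: sorted(ids) for kconfig, ids in grouped.items()}


msg = ("arch/alpha/include/asm/mmu_context.h:160:24: error: "
       "invalid type argument of '->' (have 'int')")
summary = group_by_kconfig([("alpha-allmodconfig", msg)])
```

Dropping the line and column numbers means the same error keeps the
same id when unrelated changes move the code around.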

Regards,
Fengguang