Re: nios2 crash/hang in mainline due to 'lib: update LZ4 compressor module'

From: Tobias Klauser
Date: Thu Mar 02 2017 - 08:31:38 EST


On 2017-03-01 at 23:50:03 +0100, Sandra Loosemore <sandra@xxxxxxxxxxxxxxxx> wrote:
> On 03/01/2017 11:58 AM, Sven Schmidt wrote:
> >Hi Guenter, Tobias and Sandra,
> >
> >thanks for your effort here.
> >
> >On Tue, Feb 28, 2017 at 10:14:13AM -0800, Guenter Roeck wrote:
> >>On Tue, Feb 28, 2017 at 10:53:56AM -0700, Sandra Loosemore wrote:
> >>>On 02/28/2017 08:53 AM, Tobias Klauser wrote:
> >>>>(adding Sandra Loosemore to Cc due to possible relation to gcc/binutils
> >>>>for nios2)
> >>>>
> >>>>On 2017-02-26 at 22:03:38 +0100, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >>>>>Hi Sven,
> >>>>>
> >>>>>my qemu test for nios2 started failing with commit 4e1a33b105dd ("lib:
> >>>>>update LZ4 compressor module"). The test hangs early during boot before
> >>>>>any console output is seen. Reverting the offending patch as well as the
> >>>>>subsequent lz4 related patches fixes the problem. Disabling CONFIG_RD_LZ4
> >>>>>and with it other LZ4 options also fixes it (as does adding "return -EINVAL;"
> >>>>>at the top of the LZ4 decompression code). For reference, bisect log
> >>>>>is attached.
> >>>>>
> >>>>>I tried with buildroot toolchains using gcc 6.1.0 as well as 6.3.0
> >>>>>and binutils 2.26.1. Scripts used to run the tests are available at
> >>>>>https://github.com/groeck/linux-build-test/tree/master/rootfs/nios2.
> >>>>>Qemu is from qemu mainline or qemu v2.8 with nios2 patches applied.
> >>>>
> >>>>Looks like this is somehow related to gcc/binutils. Using GCC 4.8.3 and
> >>>>binutils 2.24.51 (both from Sourcery CodeBench Lite 2014.05) I can
> >>>>get a kernel booting on latest master branch. AFAICT, none of the
> >>>>LZ4_decompress_* functions are called during boot.
> >>>>
> >
> >It seems a bit strange that code which is not actually called causes problems like that.
> >
> >Please let me know if and how I may help you figure out what's happening, especially
> >regarding the differences between the previous LZ4 and the current implementation.
> >
> >>>>However, using a self-built GCC 7.0 (20161127) and binutils 2.27 I can
> >>>>reproduce the problem you see using the instructions Guenter provided in
> >>>>the reply to Sven.
> >>>>
> >>>>I'll try to dig a bit deeper from here on. Any suggestions on what to
> >>>>look out for wrt the differences between the gcc/binutils versions are
> >>>>welcome of course.
> >>>
> >>>This message doesn't give me enough context to know what is going on,
> >>>especially without seeing the rest of the thread. Generally speaking,
> >>>Mentor recommends you use one of our stable releases instead of trying to
> >>>roll your own from mainline sources. As an upstream binutils and gcc
> >>>maintainer I do try my best to look at bug reports for those components, but
> >>>I need a reproducible standalone testcase and specific versions of the
> >>>different components involved.
> >>>
> >>The problem is also seen with Sourcery CodeBench Lite 2016.11-32 (gcc 6.2.0,
> >>binutils 2.26.51). I can provide additional details if needed, but we don't
> >>have a good enough understanding of the problem to be able to provide a
> >>reduced size test case. The test used to reproduce the problem is available
> >>at https://github.com/groeck/linux-build-test/tree/master/rootfs/nios2,
> >>run on the ToT linux kernel.
>
> Just a suggestion: can you try binutils trunk, too? Alan Modra and
> I just tracked down and fixed a bug with the linker creating bad
> executables that the kernel's ELF loader couldn't properly map into
> memory. IIUC it only affected programs that use dynamic libraries,
> but maybe there was more to it than that. In any case it would be
> good to know if the problem has already been fixed before
> investigating further.

Thanks for the suggestion.

Just tried it with a kernel compiled with binutils trunk as of today
(2.28.51.20170302) and latest gcc snapshot (7.0.1 20170226).
Unfortunately, the issue still persists.

Tobias