Re: vmlinux ELF header sometimes corrupt

From: Rasmus Villemoes
Date: Fri Jan 24 2020 - 06:41:58 EST


On 24/01/2020 11.50, Michael Ellerman wrote:
> Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx> writes:
>> I'm building for a ppc32 (mpc8309) target using Yocto, and I'm hitting a
>> very hard to debug problem that maybe someone else has encountered. This
>> doesn't happen always, perhaps 1 in 8 times or something like that.
>>
>> The issue is that when the build gets to do "${CROSS}objcopy -O binary
>> ... vmlinux", vmlinux is not (no longer) a proper ELF file, so naturally
>> that fails with
>>
>> powerpc-oe-linux-objcopy:vmlinux: file format not recognized
>>
>>
>> Any ideas?
>
> Not really sorry. Haven't seen or heard of that before.
>
> Are you doing a parallel make? If so does -j 1 fix it?

Hard to say, I'll have to try that a number of times to see if it can be
reproduced with that setting.
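
As a rough guide to how many times: a back-of-the-envelope sketch, assuming the failure really occurs in about 1 in 8 builds and that builds are independent of each other (both are assumptions, and the function names are made up for illustration).

```python
import math


def clean_streak_probability(p_fail: float, n: int) -> float:
    """Probability of n clean builds in a row if each build fails with p_fail."""
    return (1.0 - p_fail) ** n


def builds_needed(p_fail: float, confidence: float = 0.95) -> int:
    """Smallest n for which a clean streak of n builds has probability
    below (1 - confidence) when the bug is still present."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    p = 1 / 8  # assumed failure rate, taken from "perhaps 1 in 8 times"
    n = builds_needed(p)
    print(f"{n} clean -j 1 builds needed; "
          f"chance of that streak by luck: {clean_streak_probability(p, n):.3f}")
```

Under those assumptions, a handful of clean -j 1 builds says very little; it takes a couple of dozen before a clean streak is unlikely to be luck.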

> If it seems like sortextable is at fault then strace'ing it would be my
> next step.

I don't think sortextable is at fault; it was just my first suspect
because I know it at least pokes around in the ELF file. I do "cp vmlinux
vmlinux.before_sort" and "cp vmlinux vmlinux.after_sort", and both of
those copies are proper ELF files - and the .after_sort is identical to
the corrupt vmlinux apart from vmlinux ending up with its ELF header wiped.
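
A minimal sketch of that comparison, for anyone wanting to repeat it: it checks for the ELF magic bytes and reports the byte range in which two files differ. The helper names are hypothetical and the script is not part of Kbuild or Yocto; the file names are the ones from the cp commands above.

```python
import sys

ELF_MAGIC = b"\x7fELF"  # first four bytes of every valid ELF file


def has_elf_magic(data: bytes) -> bool:
    """Return True if the buffer starts with the ELF magic number."""
    return data[:4] == ELF_MAGIC


def differing_range(a: bytes, b: bytes):
    """Return (first, last) offsets at which a and b differ, or None if equal.

    If the lengths differ, the range extends to the end of the longer buffer.
    """
    if a == b:
        return None
    n = min(len(a), len(b))
    # First differing offset in the common prefix, or n if only the length differs.
    first = next((i for i in range(n) if a[i] != b[i]), n)
    if len(a) != len(b):
        last = max(len(a), len(b)) - 1
    else:
        last = next(i for i in range(n - 1, -1, -1) if a[i] != b[i])
    return first, last


def main(good_path: str, suspect_path: str) -> None:
    with open(good_path, "rb") as f:
        good = f.read()
    with open(suspect_path, "rb") as f:
        suspect = f.read()
    print(f"{good_path}: ELF magic present: {has_elf_magic(good)}")
    print(f"{suspect_path}: ELF magic present: {has_elf_magic(suspect)}")
    print(f"differing byte range: {differing_range(good, suspect)}")


if __name__ == "__main__" and len(sys.argv) == 3:
    # e.g. python3 elfcheck.py vmlinux.after_sort vmlinux
    main(sys.argv[1], sys.argv[2])
```

An ELF32 header is 52 bytes long, so if only the header were wiped, the reported range would be expected to fall within offsets 0 to 51.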

So it's something that happens during some later build step (Yocto has a
lot of steps); perhaps "make modules" or "make modules_install" or
something similar ends up deciding "hey, vmlinux isn't quite up to date,
let's nuke it". I'm not even sure it's a Kbuild problem, but I've seen
the same thing happen using another meta-build system called oe-lite,
which is why I'm not primarily suspecting the Yocto logic.

Rasmus