Re: PROBLEM: zstd bzImage decompression fails for some x86_32 config on 5.9-rc1

From: Nick Terrell
Date: Tue Sep 29 2020 - 01:16:45 EST




> On Sep 28, 2020, at 11:02 AM, Nick Terrell <terrelln@xxxxxx> wrote:
>
>
>
>> On Sep 28, 2020, at 1:55 AM, Feng Tang <feng.tang@xxxxxxxxx> wrote:
>>
>> Hi Nick,
>>
>> 0day has found a kernel decompression failure since 5.9-rc1 (X86_32
>> build), and it could be related to the ZSTD code, though initially we
>> bisected to some other commits.
>>
>> The error messages are:
>>
>> early console in setup code
>> Wrong EFI loader signature.
>> early console in extract_kernel
>> input_data: 0x046f50b4
>> input_len: 0x01ebbeb6
>> output: 0x01000000
>> output_len: 0x04fc535c
>> kernel_total_size: 0x055f5000
>> needed_size: 0x055f5000
>>
>> Decompressing Linux...
>>
>> ZSTD-compressed data is corrupt
>>
>> This can be reproduced by compiling the kernel with the attached config
>> and booting it in QEMU.
>>
>> We suspect it is related to the kernel size, as we only see it on big
>> kernels. Some more info:
>>
>> * If we remove a lot of kernel config to build a much smaller kernel,
>> it will boot fine
>>
>> * If we change the zstd algorithm from zstd22 to zstd19, the kernel will
>> boot fine with the patch below
>>
>> diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
>> index 3962f59..8fe71ba 100644
>> --- a/arch/x86/boot/compressed/Makefile
>> +++ b/arch/x86/boot/compressed/Makefile
>> @@ -147,7 +147,7 @@ $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE
>> $(obj)/vmlinux.bin.zst: $(vmlinux.bin.all-y) FORCE
>> - $(call if_changed,zstd22)
>> + $(call if_changed,zstd)
>>
>>
>> Please let me know if you need more info, and sorry for the late report,
>> as we only just tracked it down to this point.
>
> Thanks for the report, I will look into it today.

CC: Petr Malat

I’ve successfully reproduced the failure and found the cause. It turns out
that this patch [0] from Petr Malat fixes the issue. As I mentioned in that
thread, his fix corresponds to this upstream commit [1].

Can we get Petr's patch merged into v5.9?

This bug only happens when the window size is > 8 MB. A non-kernel workaround
would be to compress the kernel at level 19, which uses an 8 MB window size,
instead of level 22, which uses a 128 MB window size.
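
To make the window arithmetic concrete, here is a rough sketch in Python. It
assumes the window is a power of two given by a window log, and the logs 23
and 27 are simply the values that match the 8 MB and 128 MB sizes above. The
names window_size and window_affected are made up for illustration and are
not part of zstd or the kernel.

```python
MIB = 1 << 20
# Per the mail: only window sizes strictly above 8 MB are affected.
BUG_THRESHOLD = 8 * MIB


def window_size(window_log: int) -> int:
    """Size in bytes of a power-of-two window (illustrative helper)."""
    return 1 << window_log


def window_affected(window_log: int) -> bool:
    """True if a window of this size is large enough to hit the bug."""
    return window_size(window_log) > BUG_THRESHOLD


# (level, window log) pairs assumed from the sizes quoted in the mail.
for level, window_log in ((19, 23), (22, 27)):
    size_mib = window_size(window_log) // MIB
    print(f"level {level}: {size_mib} MiB window, "
          f"affected: {window_affected(window_log)}")
```

Under those assumptions level 19 sits exactly at the 8 MB boundary and is not
affected, while level 22 is well above it.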

It only shows up for large kernels because the code is only buggy when an
offset > 8 MB is used, and a kernel <= 8 MB can never contain such an offset.
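
The same reasoning as a small Python sketch, assuming the longest possible
back-reference is bounded by the smaller of the window size and the
uncompressed size. This is a simplified model of the condition, not the
decompressor code, and max_offset and can_trigger are hypothetical names.
The output_len value is the one from the boot log in the report.

```python
MIB = 1 << 20
# Per the mail: only offsets strictly above 8 MB hit the buggy path.
BUG_THRESHOLD = 8 * MIB


def max_offset(window_bytes: int, output_len: int) -> int:
    """Upper bound on a match offset: it cannot reach back further than
    the window, nor further than the data produced so far."""
    return min(window_bytes, output_len)


def can_trigger(window_bytes: int, output_len: int) -> bool:
    """True if some offset above the threshold is possible at all."""
    return max_offset(window_bytes, output_len) > BUG_THRESHOLD


REPORTED_OUTPUT_LEN = 0x04FC535C  # output_len from the failing boot log

print(can_trigger(128 * MIB, REPORTED_OUTPUT_LEN))  # 128 MB window, big kernel
print(can_trigger(8 * MIB, REPORTED_OUTPUT_LEN))    # 8 MB window, big kernel
print(can_trigger(128 * MIB, 6 * MIB))              # 128 MB window, small kernel
```

Only the first combination can produce an offset above the threshold, which
matches what Feng observed: either a smaller kernel or level 19 avoids the
failure.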

Best,
Nick

[0] https://lkml.org/lkml/2020/9/14/94
[1] https://github.com/facebook/zstd/commit/8a5c0c98ae5a7884694589d7a69bc99011add94d

> Best,
> Nick
>
>> Thanks,
>> Feng
>>
>>
>>
>> <zstd_x86_32.config>