Re: erofs pointer corruption and kernel crash
From: Arseniy Krasnov
Date: Mon Apr 13 2026 - 03:20:24 EST
13.04.2026 10:08, Gao Xiang wrote:
>
>
> On 2026/4/11 23:10, Arseniy Krasnov wrote:
>>
>>
>> 10.04.2026 18:41, Gao Xiang wrote:
>>> Hi Arseniy,
>>>
>>> On 2026/4/10 21:27, Arseniy Krasnov wrote:
>>>>
>>>>
>>>> 10.04.2026 15:20, Gao Xiang wrote:
>>>>>
>>>>>
>>>>> On 2026/4/10 19:37, Arseniy Krasnov wrote:
>>>>>
>>>>> (drop unrelated folks since they all subscribed erofs mailing list)
>>>>>
>>>>>>
>>>>>>
>>>>>> 10.04.2026 11:31, Gao Xiang wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 2026/4/10 16:13, Arseniy Krasnov wrote:
>
> ...
>
>>>>>>>
>>>>>>> I need more information to find some clues.
>>>>>>
>>>>>>
>>>>>>
>>>>>> So I reproduced it again with this debug patch, which adds a magic value to 'struct z_erofs_pcluster' and prints the 'struct folio'
>>>>>> when the pointer in 'private' is passed to 'erofs_onlinefolio_end()'. In short: 'private' points to a 'struct z_erofs_pcluster'.
>>>>> First, erofs-utils 1.8.10 doesn't support `-E48bit`:
>>>>> only erofs-utils 1.9+ ships it as an experimental
>>>>> feature, see the ChangeLog; so I think you're using
>>>>> a modified erofs-utils 1.8.10:
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git/tree/ChangeLog
>>>>>
>>>>> ```
>>>>> erofs-utils 1.9
>>>>>
>>>>> * This release includes the following updates:
>>>>> - Add 48-bit layout support for larger filesystems (EXPERIMENTAL);
>>>>> ```
>>>>>
>>>>> Second, I'm pretty sure this issue is related to the
>>>>> experimental `-E48bit`, and this information is
>>>>> not enough for me to find the root cause, so I
>>>>> need to find a way to reproduce it myself. It may
>>>>> take time; you could debug it yourself, but I don't
>>>>> think it's an easy task if you aren't quite familiar
>>>>> with the EROFS codebase.
>>>>>
>>>>> Anyway, if you need a quick solution for production,
>>>>> I really suggest not using `-E48bit + zstd` like
>>>>> this for now: try other options like
>>>>> `-zzstd -C65536 -Efragments` instead, since those
>>>>> are common production choices.
>>>>
>>>> Ok, thanks for this advice! One more question: currently we use these options:
>>>> "zstd,22 --max-extent-bytes 65536 -E48bit". Ok, we will remove "zstd,22" and "E48bit",
>>>> but what about "--max-extent-bytes 65536": is it considered a stable option?
>>>> Or is it better to use your version: "-zzstd -C65536 -Efragments"?
>>>
>>> I'm not sure how you found this
>>> "zstd,22 --max-extent-bytes 65536 -E48bit" combination.
>>>
>>> My suggestion, based on production experience, is that as long as
>>> you don't combine `-zzstd` with `-E48bit`, it should be fine.
>>>
>>> If you need smaller images, I suggest: `-zlzma,9 -C65536 -Efragments`
>>> Or like Android, they all use `-zlz4hc`,
>>> Or zstd, but don't add `-E48bit`.
>>>
>>> As for "--max-extent-bytes 65536", it can be dropped
>>> since if `-E48bit` is not used, it only has negative
>>> impacts.
>>>
>>> In short, `-E48bit` + `-zzstd` + `--max-extent-bytes`
>>> enables new unaligned compression for zstd, but it's
>>> a relatively new feature. I still need some time to
>>> stabilize it, but my own time is limited and everything
>>> has to be prioritized.
>>
>> Ok, thanks for this advice!
>
> FYI, I can reproduce this issue locally with `-E48bit`
> enabled within 600s.
>
> I do think it's a `-E48bit` + zstd issue, so setups
> without `-E48bit` won't be impacted, and I will find time
> to troubleshoot it this week.
Yes, without '-E48bit' we also couldn't reproduce it over the entire weekend on several boards. No such panics.
Thanks
>
> Thanks,
> Gao Xiang
>
>>
>> Thanks
>>
>>>
>>> Thanks,
>>> Gao Xiang
>>>
>>>>
>>>> Thanks
>>>>
>>>>>
>>>>> Thanks,
>>>>> Gao Xiang
>>>
>
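
For reference, the option advice in this thread can be sketched as a small guard in an image build script. This is only a minimal sketch: the function name `build_mkfs_erofs_args` and its parameters are hypothetical and not part of erofs-utils. The flags themselves (`-zzstd`, `-C65536`, `-Efragments`, `-E48bit`) are the ones quoted above, and the `mkfs.erofs [options] IMAGE SOURCE` argument order is assumed.

```python
# Hypothetical helper for a build script; not part of erofs-utils.
def build_mkfs_erofs_args(image, source, compressor="zstd", extended=("fragments",)):
    """Return an argv list for mkfs.erofs, refusing zstd together with 48bit."""
    # "zstd,22" -> "zstd": the part after the comma is the compression level.
    algorithm = compressor.split(",")[0]
    if algorithm == "zstd" and "48bit" in extended:
        # The combination reported in this thread to trigger the crash.
        raise ValueError("-E48bit together with zstd is not considered stable yet")
    args = ["mkfs.erofs", "-z" + compressor, "-C65536"]
    # One -E flag per extended option, in the same form as used in the thread.
    args += ["-E" + option for option in extended]
    # Assumed argument order: output image first, then the source directory.
    args += [image, source]
    return args
```

With the defaults this yields the `-zzstd -C65536 -Efragments` variant suggested above; the resulting list can be passed to `subprocess.run()` as is. `--max-extent-bytes` is deliberately left out, since it was said to have only negative impact without `-E48bit`.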