Re: XFS metadata CRC errors on zram block device on ppc64le architecture
From: Dusty Mabe
Date: Thu Aug 03 2023 - 17:32:47 EST

On 8/2/23 08:00, Dusty Mabe wrote:
>
>
> On 8/2/23 07:03, Hannes Reinecke wrote:
>> On 8/2/23 11:41, Christoph Hellwig wrote:
>>> On Tue, Aug 01, 2023 at 11:31:37PM -0400, Dusty Mabe wrote:
>>>> We ran a kernel bisect and narrowed it down to offending commit af8b04c6:
>>>>
>>>> ```
>>>> [root@ibm-p8-kvm-03-guest-02 linux]# git bisect good
>>>> af8b04c63708fa730c0257084fab91fb2a9cecc4 is the first bad commit
>>>> commit af8b04c63708fa730c0257084fab91fb2a9cecc4
>>>> Author: Christoph Hellwig <hch@xxxxxx>
>>>> Date: Tue Apr 11 19:14:46 2023 +0200
>>>>
>>>> zram: simplify bvec iteration in __zram_make_request
>>>>
>>>> bio_for_each_segment synthetize bvecs that never cross page boundaries, so
>>>> don't duplicate that work in an inner loop.
>>>> ```
>>>
>>>> Any ideas on how to fix the problem?
>>>
>>> So the interesting cases are:
>>>
>>> - ppc64 usually uses 64k page sizes
>>> - ppc64 is somewhat cache incoherent (compared to say x86)
>>>
>>> Let me think of this a bit more.
>>
>> Would need to be confirmed first that 64k pages really are in use
>> (eg we compile ppc64le with 4k page sizes ...).
>> Dusty?
>> For which page size did you compile your kernel?
>
>
> For Fedora the configuration is to enable 64k pages with CONFIG_PPC_64K_PAGES=y
> https://src.fedoraproject.org/rpms/kernel/blob/064c1675a16b4d379b42ab6c3397632ca54ad897/f/kernel-ppc64le-fedora.config#_4791
>
> I used the same configuration when running the git bisect.
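
In case it is useful to double-check the page size on the running guest rather than from the config file, here is a minimal sketch using getconf. The helper name page_size is only illustrative and not from anything above; a kernel built with CONFIG_PPC_64K_PAGES=y should report 65536 bytes.

```shell
# Hypothetical helper (the name is illustrative): print the runtime
# page size in bytes as reported by getconf.
page_size() {
    getconf PAGESIZE
}

# Show the value in bytes and in KiB. With 64k pages this should be
# 65536 bytes; a 4k-page kernel would report 4096.
bytes=$(page_size)
echo "page size: ${bytes} bytes ($(( bytes / 1024 ))k)"
```
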
Naive question from my side: would this be a candidate for reverting while we investigate the root cause?

Dusty