Re: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 in nilfs_segctor_do_construct
From: ARAI Shun-ichi
Date: Sat Mar 28 2020 - 05:45:44 EST
In Msg <874kuapb2s.fsf@xxxxxxxxxx>;
Subject "Re: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 in nilfs_segctor_do_construct":
> Tomas Hlavaty <tom@xxxxxxxxxx> writes:
>>>> 2) Can you mount the corrupted(?) partition from a recent version of
>>>> kernel ?
>
> I tried the following Linux kernel versions:
>
> - v4.19
> - v5.4
> - v5.5.11
>
> and still get the crash
Ryusuke Konishi pointed out:
In Msg <CAKFNMomjWkNvHvHkEp=Jv_BiGPNj=oLEChyoXX1yCj5xctAkMA@xxxxxxxxxxxxxx>;
Subject "Re: BUG: kernel NULL pointer dereference, address: 00000000000000a8":
> As the result of bisection, it turned out that commit
> f4bdb2697ccc9cecf1a9de86905c309ad901da4c on 5.3.y
> ("mm/filemap.c: don't initiate writeback if mapping has no dirty pages")
> triggers the crash.
This commit modifies __filemap_fdatawrite_range() as follows.
[before]
	if (!mapping_cap_writeback_dirty(mapping))
		return 0;
[after]
	if (!mapping_cap_writeback_dirty(mapping) ||
	    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
		return 0;
I did a simple test with the following code (kernel 5.5.13).
[test]
	if (!mapping_cap_writeback_dirty(mapping) ||
	    mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
		return 0;
With this change the crash did not occur (I have not tried long-term
operation, though). So I think the problem may be related to
PAGECACHE_TAG_TOWRITE.
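To compare the three conditions side by side, here is a small user-space
sketch (not kernel code). The TOY_TAG_* flags, struct toy_mapping and the
helper names are made up for illustration; only the three conditions are
taken from the [before]/[after]/[test] snippets. It only shows which tag
combinations make each variant return early. It does not model how the
kernel actually sets or clears the tags.

```c
/* Toy user-space model, NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Bit flags standing in for the page cache tags. */
enum {
	TOY_TAG_DIRTY     = 1 << 0,
	TOY_TAG_WRITEBACK = 1 << 1,
	TOY_TAG_TOWRITE   = 1 << 2,
};

/* Stand-in for the mapping: a capability flag plus a tag bitmask. */
struct toy_mapping {
	bool cap_writeback_dirty;
	unsigned int tags;
};

static inline bool toy_tagged(const struct toy_mapping *m, unsigned int tag)
{
	return (m->tags & tag) != 0;
}

/* [before]: return early only if the mapping cannot do dirty writeback. */
static inline bool skip_before(const struct toy_mapping *m)
{
	return !m->cap_writeback_dirty;
}

/* [after]: also return early when nothing carries the DIRTY tag. */
static inline bool skip_after(const struct toy_mapping *m)
{
	return !m->cap_writeback_dirty || !toy_tagged(m, TOY_TAG_DIRTY);
}

/* [test]: return early when something carries the WRITEBACK tag. */
static inline bool skip_test(const struct toy_mapping *m)
{
	return !m->cap_writeback_dirty || toy_tagged(m, TOY_TAG_WRITEBACK);
}

/* Print one row: 1 means "returns 0 early", 0 means "goes on to write". */
static inline void toy_print_row(const char *label, const struct toy_mapping *m)
{
	printf("%-16s before=%d after=%d test=%d\n", label,
	       skip_before(m), skip_after(m), skip_test(m));
}

/* Walk through a few tag combinations and check where the variants differ. */
static inline void toy_self_check(void)
{
	struct toy_mapping dirty        = { true, TOY_TAG_DIRTY };
	struct toy_mapping only_towrite = { true, TOY_TAG_TOWRITE };
	struct toy_mapping in_writeback = { true, TOY_TAG_DIRTY | TOY_TAG_WRITEBACK };

	toy_print_row("DIRTY", &dirty);
	toy_print_row("TOWRITE only", &only_towrite);
	toy_print_row("DIRTY|WRITEBACK", &in_writeback);

	/* Only the [after] variant skips a mapping tagged TOWRITE but not DIRTY. */
	assert(!skip_before(&only_towrite));
	assert(skip_after(&only_towrite));
	assert(!skip_test(&only_towrite));
}
```

Calling toy_self_check() from main() prints one row per tag combination.
In this model, a mapping whose only tag is TOWRITE is skipped by [after]
but not by [before] or [test].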
One possible(?) scenario is:
 0. some write operation
 1. sync (WB_SYNC_ALL)
 2. tagged with PAGECACHE_TAG_TOWRITE
 3. __filemap_fdatawrite_range() is called and returns successfully
    (but it is a no-op)
 4. some data is freed
    (because of 3.)
 5. crash when testing/setting writeback on the freed data
      nilfs_segctor_do_construct()
        nilfs_segctor_prepare_write()
          set_page_writeback()
How about this?