Re: 6.6.8 stable: crash in folio_mark_dirty

From: Hillf Danton
Date: Sat Dec 30 2023 - 20:29:19 EST


On Sat, Dec 30, 2023 at 10:23:26AM -0500 Genes Lists <lists@xxxxxxxxxxxx>
> Apologies in advance, but I cannot git bisect this since machine was
> running for 10 days on 6.6.8 before this happened.
>
> Dec 30 07:00:36 s6 kernel: ------------[ cut here ]------------
> Dec 30 07:00:36 s6 kernel: WARNING: CPU: 0 PID: 521524 at mm/page-writeback.c:2668 __folio_mark_dirty (??:?)
> Dec 30 07:00:36 s6 kernel: CPU: 0 PID: 521524 Comm: rsync Not tainted 6.6.8-stable-1 #13 d238f5ab6a206cdb0cc5cd72f8688230f23d58df
> Dec 30 07:00:36 s6 kernel: block_dirty_folio (??:?)
> Dec 30 07:00:36 s6 kernel: unmap_page_range (??:?)
> Dec 30 07:00:36 s6 kernel: unmap_vmas (??:?)
> Dec 30 07:00:36 s6 kernel: exit_mmap (??:?)
> Dec 30 07:00:36 s6 kernel: __mmput (??:?)
> Dec 30 07:00:36 s6 kernel: do_exit (??:?)
> Dec 30 07:00:36 s6 kernel: do_group_exit (??:?)
> Dec 30 07:00:36 s6 kernel: __x64_sys_exit_group (??:?)
> Dec 30 07:00:36 s6 kernel: do_syscall_64 (??:?)

See what comes out if the race is handled.
Only for thoughts.

--- x/mm/page-writeback.c
+++ y/mm/page-writeback.c
@@ -2661,12 +2661,19 @@ void __folio_mark_dirty(struct folio *fo
 {
 	unsigned long flags;

+again:
 	xa_lock_irqsave(&mapping->i_pages, flags);
-	if (folio->mapping) {	/* Race with truncate? */
+	if (folio->mapping && mapping == folio->mapping) {
 		WARN_ON_ONCE(warn && !folio_test_uptodate(folio));
 		folio_account_dirtied(folio, mapping);
 		__xa_set_mark(&mapping->i_pages, folio_index(folio),
 				PAGECACHE_TAG_DIRTY);
+	} else if (folio->mapping) { /* Race with truncate? */
+		struct address_space *tmp = folio->mapping;
+
+		xa_unlock_irqrestore(&mapping->i_pages, flags);
+		mapping = tmp;
+		goto again;
 	}
 	xa_unlock_irqrestore(&mapping->i_pages, flags);
 }
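
For anyone who wants to poke at the retry idea outside the kernel, here is a
rough userspace sketch of the same pattern: take the lock of the owner that
was passed in, re-check under that lock that the object still belongs to it,
and if it has moved, drop the lock and try again with the current owner.
Everything in it (struct owner, struct item, mark_dirty(), nr_dirty) is
made up for illustration and is not kernel API. A pthread mutex stands in for
xa_lock_irqsave(), and the sketch says nothing about kernel memory ordering or
about whether the race in this report is actually the one being hit.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an address_space: a lock plus a dirty counter. */
struct owner {
	pthread_mutex_t lock;
	int nr_dirty;
};

/* Hypothetical stand-in for a folio: it only records its current owner. */
struct item {
	struct owner *owner;	/* may have changed, or be NULL if detached */
};

/*
 * Charge the item as dirty to the owner it belongs to right now.
 * Returns the owner that was charged, or NULL if the item is detached.
 */
struct owner *mark_dirty(struct item *item, struct owner *owner)
{
	struct owner *charged = NULL;

	assert(item != NULL && owner != NULL);
again:
	pthread_mutex_lock(&owner->lock);
	if (item->owner == owner) {
		/* Still ours: account under the matching lock. */
		owner->nr_dirty++;
		charged = owner;
	} else if (item->owner != NULL) {
		/* Moved to another owner: retry under that owner's lock. */
		struct owner *tmp = item->owner;

		pthread_mutex_unlock(&owner->lock);
		owner = tmp;
		goto again;
	}
	/* Detached items fall through without being accounted. */
	pthread_mutex_unlock(&owner->lock);
	return charged;
}
```

Calling mark_dirty() with a stale owner ends up charging the item's current
owner instead, and a detached item (owner == NULL) is skipped, which mirrors
the three branches in the diff above.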
--