Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition

From: Yu Kuai
Date: Wed Mar 15 2023 - 05:55:59 EST

Next message: Simon Horman: "Re: [PATCH net 1/2] net: dsa: don't error out when drivers return ETH_DATA_LEN in .port_max_mtu()"
Previous message: Vladimir Oltean: "Re: [PATCH] net: dsa: mv88e6xxx: don't dispose of Global2 IRQ mappings from mdiobus code"
In reply to: Guoqing Jiang: "Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition"
Next in thread: Donald Buczek: "Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi,

On 2023/03/15 17:30, Guoqing Jiang wrote:

>> Just borrow this thread to discuss, I think this commit might have
>> a problem in some corner cases:
>>
>> t1:                t2:
>> action_store
>> mddev_lock
>> if (mddev->sync_thread)
>>    mddev_unlock
>>    md_unregister_thread
>>                 md_check_recovery
>>                  set_bit(MD_RECOVERY_RUNNING, &mddev->recovery)
>>                  queue_work(md_misc_wq, &mddev->del_work)
>>    mddev_lock_nointr
>>    md_reap_sync_thread
>>    // clear running
>> mddev_lock
>>
>> t3:
>> md_start_sync
>> // running is not set
>
> What does 'running' mean? MD_RECOVERY_RUNNING?
>
>> Our tests reported a problem that could in theory be caused by this,
>> but we can't be sure for now...
>
> I guess you tried to describe a race between
>
> action_store -> md_register_thread
>
> and
>
> md_start_sync -> md_register_thread
>
> Didn't you already fix them in the series?
>
> [PATCH -next 0/5] md: fix uaf for sync_thread
>
> Sorry, I didn't follow the problem and also your series, I might try your
> test with the latest mainline kernel if the test is available somewhere.
>
>> We thought about how to fix this, instead of calling
>> md_register_thread() here to wait for sync_thread to be done
>> synchronously,
>
> IMO, md_register_thread just creates and wakes a thread, not sure why it
> waits for sync_thread.
>
>> we do this asynchronously like what md_set_readonly() and do_md_stop() do.
>
> Still, I don't have a clear picture of the problem, so I can't judge it.

Sorry that I didn't explain the problem clearly. Let me first describe
the problem we hit:

1) raid10d is waiting for sync_thread to stop:
raid10d
 md_unregister_thread
  kthread_stop

2) sync_thread is waiting for io to finish:
md_do_sync
 wait_event(... atomic_read(&mddev->recovery_active) == 0)

3) io is waiting for raid10d to finish (the online crash session found 2 io in conf->retry_list)
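
The three waits above close a loop: raid10d waits for sync_thread, sync_thread waits for io, and io waits for raid10d. A minimal sketch of that wait-for check, as a plain userspace model; the enum, the table and the helper names are illustrative assumptions, not kernel code:

```c
#include <assert.h>

/* Hypothetical labels for the three parties in the hang. */
enum waiter { RAID10D, SYNC_THREAD, PENDING_IO, NR_WAITERS };

#define WAITS_FOR_NOTHING (-1)

/* waits_for[x] is the party that x is blocked on, as described in 1) to 3). */
static const int observed_waits[NR_WAITERS] = {
    [RAID10D]     = SYNC_THREAD, /* kthread_stop() on the sync thread */
    [SYNC_THREAD] = PENDING_IO,  /* recovery_active has to drop to 0 */
    [PENDING_IO]  = RAID10D,     /* io sits in conf->retry_list */
};

/* Follow the wait-for edges from 'start'; return 1 if they lead back to it. */
static int in_wait_cycle(const int *waits_for, int n, int start)
{
    int cur = start;

    assert(start >= 0 && start < n);
    for (int step = 0; step < n; step++) {
        cur = waits_for[cur];
        if (cur == WAITS_FOR_NOTHING)
            return 0; /* the chain ends, so nobody is stuck forever */
        if (cur == start)
            return 1; /* back where we began: a deadlock cycle */
    }
    return 0;
}

/* Convenience wrapper for the hang described above. */
static int observed_hang_is_cycle(void)
{
    return in_wait_cycle(observed_waits, NR_WAITERS, RAID10D);
}
```

If any one of the three edges is removed (for example, if raid10d did not have to wait for sync_thread), the walk ends and the loop is broken.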

Additional information from online crash:
mddev->recovery = 29, // DONE, RUNNING, INTR are set
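
For readers who want to decode the value by hand, here is a small userspace sketch. The bit positions are an assumption taken from the order of enum recovery_flags in drivers/md/md.h (RUNNING = 0, SYNC = 1, RECOVER = 2, INTR = 3, DONE = 4, NEEDED = 5); please check them against the md.h of the kernel that produced the dump. The helper name is made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ASSUMED bit order, mirroring enum recovery_flags in drivers/md/md.h.
 * Only the low six bits are listed; names drop the MD_RECOVERY_ prefix. */
static const char *const recovery_flag_names[] = {
    "RUNNING", /* bit 0 */
    "SYNC",    /* bit 1 */
    "RECOVER", /* bit 2 */
    "INTR",    /* bit 3 */
    "DONE",    /* bit 4 */
    "NEEDED",  /* bit 5 */
};

/* Write the space-separated names of the set bits into buf and return buf. */
static char *decode_recovery(unsigned long recovery, char *buf, size_t len)
{
    size_t n = sizeof(recovery_flag_names) / sizeof(recovery_flag_names[0]);
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t bit = 0; bit < n; bit++) {
        int w;

        if (!(recovery & (1UL << bit)))
            continue;
        w = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", recovery_flag_names[bit]);
        if (w < 0 || (size_t)w >= len - used)
            break; /* output truncated, stop here */
        used += (size_t)w;
    }
    return buf;
}
```

Under the assumed bit order, 29 = 1 + 4 + 8 + 16, so besides RUNNING, INTR and DONE the RECOVER bit would be set as well.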

PID: 138293 TASK: ffff0000de89a900 CPU: 7 COMMAND: "md0_resync"
#0 [ffffa00107c178a0] __switch_to at ffffa0010001d75c
#1 [ffffa00107c178d0] __schedule at ffffa001017c7f14
#2 [ffffa00107c179f0] schedule at ffffa001017c880c
#3 [ffffa00107c17a20] md_do_sync at ffffa0010129cdb4
#4 [ffffa00107c17d50] md_thread at ffffa00101290d9c
#5 [ffffa00107c17e50] kthread at ffffa00100187a74

PID: 138294 TASK: ffff0000eba13d80 CPU: 5 COMMAND: "md0_resync"
#0 [ffffa00107e47a60] __switch_to at ffffa0010001d75c
#1 [ffffa00107e47a90] __schedule at ffffa001017c7f14
#2 [ffffa00107e47bb0] schedule at ffffa001017c880c
#3 [ffffa00107e47be0] schedule_timeout at ffffa001017d1298
#4 [ffffa00107e47d50] md_thread at ffffa00101290ee8
#5 [ffffa00107e47e50] kthread at ffffa00100187a74
// there are two sync_threads for md0

I believe the root cause is that two sync_threads exist for the same
mddev, and here is how I think that can happen:

t1:                t2:
action_store
 mddev_lock
  if (mddev->sync_thread)
   mddev_unlock
   md_unregister_thread
   // first sync_thread is done
                md_check_recovery
                 set_bit(MD_RECOVERY_RUNNING, &mddev->recovery)
                 queue_work(md_misc_wq, &mddev->del_work)
   mddev_lock_nointr
   md_reap_sync_thread
   // MD_RECOVERY_RUNNING is cleared
 mddev_unlock

t3:
md_start_sync
// second sync_thread is registered

t3:
md_check_recovery
 queue_work(md_misc_wq, &mddev->del_work)
 // MD_RECOVERY_RUNNING is not set, a new sync_thread can be started
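
To make the interleaving above easier to follow, here is a toy single-threaded replay of it. This is only a sketch of the hypothesis, not kernel code: every name starting with model_ is made up, and the guard is reduced to the RUNNING and DONE flags, whereas the real md_check_recovery() checks more than that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; fields only stand in for the kernel state named in the trace. */
struct md_model {
    bool running;      /* stands in for MD_RECOVERY_RUNNING */
    bool done;         /* stands in for MD_RECOVERY_DONE */
    int  queued;       /* pending start requests (del_work in the trace) */
    int  sync_threads; /* sync threads currently alive */
};

/* The old sync thread is stopped and leaves DONE behind. */
static void model_unregister_thread(struct md_model *m)
{
    assert(m->sync_threads > 0);
    m->sync_threads--;
    m->done = true;
}

/* Simplified guard: skip only while an unfinished sync is flagged. */
static bool model_check_recovery(struct md_model *m)
{
    if (m->running && !m->done)
        return false;
    m->done = false;
    m->running = true;
    m->queued++;
    return true;
}

/* Tail of the reap path: clears the flags without knowing who set them. */
static void model_reap_sync_thread(struct md_model *m)
{
    m->running = false;
    m->done = false;
}

/* Queued work runs and registers one sync thread. */
static void model_start_sync(struct md_model *m)
{
    if (m->queued > 0) {
        m->queued--;
        m->sync_threads++;
    }
}

/* Replay the steps; returns the number of sync threads alive at the end. */
static int replay(bool reap_before_check)
{
    struct md_model m = { .running = true, .sync_threads = 1 };

    model_unregister_thread(&m);    /* t1: first sync_thread is done */
    if (reap_before_check)
        model_reap_sync_thread(&m); /* serialized order, no window */
    model_check_recovery(&m);       /* t2: sets RUNNING, queues work */
    if (!reap_before_check)
        model_reap_sync_thread(&m); /* t1: clears the RUNNING set by t2 */
    model_start_sync(&m);           /* t3: second sync_thread registered */
    model_check_recovery(&m);       /* passes only if RUNNING was lost */
    model_start_sync(&m);           /* may start one more sync_thread */
    return m.sync_threads;
}
```

In this model replay(false), the order from the trace, ends with two live sync threads, while replay(true), where the reap finishes before md_check_recovery runs, ends with one.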

This is just a guess; I can't reproduce the problem yet. Please let me
know if you have any questions.

Thanks,
Kuai
