On 2023/03/14 21:55, Guoqing Jiang wrote:
Hi, Guoqing,
On 3/14/23 21:25, Marc Smith wrote:
On Mon, Feb 8, 2021 at 7:49 PM Guoqing Jiang
<guoqing.jiang@xxxxxxxxxxxxxxx> wrote:
Hi Donald,
I'm still hitting this issue with Linux 5.4.229 -- it looks like 1/2
On 2/8/21 19:41, Donald Buczek wrote:
Dear Guoqing,
Great. I will send a formal patch with your reported-by and tested-by.
On 08.02.21 15:53, Guoqing Jiang wrote:
Yes, this works. No deadlock after >11000 seconds,
On 2/8/21 12:38, Donald Buczek wrote:
5. maybe don't hold reconfig_mutex when trying to unregister
sync_thread, like this.

        /* resync has finished, collect result */
        mddev_unlock(mddev);
        md_unregister_thread(&mddev->sync_thread);
        mddev_lock(mddev);

As above: While we wait for the sync thread to terminate, wouldn't it
be a problem, if another user space operation takes the mutex?

I don't think other places can block while holding the mutex, otherwise
those places could cause a potential deadlock. Please try the above
two-line change. And perhaps others have a better idea.
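
The unlock / unregister / lock ordering proposed above can be sketched in
user space. This is only an analogy, not kernel code: it assumes that the
thread being stopped needs the same mutex (directly or indirectly) before it
can exit. All names (reconfig, worker, reap_worker, demo_reap) are
hypothetical; pthread_join stands in for md_unregister_thread().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for reconfig_mutex. */
static pthread_mutex_t reconfig = PTHREAD_MUTEX_INITIALIZER;
static bool result_collected;

/* Stand-in for the sync thread: it needs the mutex once before exiting. */
static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&reconfig);
    result_collected = true;
    pthread_mutex_unlock(&reconfig);
    return NULL;
}

/* Caller holds `reconfig` on entry and again on return. */
static void reap_worker(pthread_t tid)
{
    int rc;

    pthread_mutex_unlock(&reconfig);   /* mddev_unlock(mddev) analogue */
    rc = pthread_join(tid, NULL);      /* md_unregister_thread() analogue */
    assert(rc == 0);
    pthread_mutex_lock(&reconfig);     /* mddev_lock(mddev) analogue */
}

/* Returns true if the worker finished. Joining while still holding
 * `reconfig` would hang here instead, because the worker could never
 * take the mutex. */
static bool demo_reap(void)
{
    pthread_t tid;
    bool done;

    result_collected = false;
    pthread_mutex_lock(&reconfig);
    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        pthread_mutex_unlock(&reconfig);
        return false;
    }
    reap_worker(tid);
    done = result_collected;
    pthread_mutex_unlock(&reconfig);
    return done;
}
```

Note that between the unlock and the re-lock in reap_worker() any other
thread may take the mutex and change shared state, which is the concern
raised in the question above.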
(Time till deadlock from previous runs/seconds: 1723, 37, 434, 1265,
3500, 1136, 109, 1892, 1060, 664, 84, 315, 12, 820 )
Thanks,
Guoqing
of the patches that supposedly resolve this were applied to the stable
kernels, however, one was omitted due to a regression:
md: don't unregister sync_thread with reconfig_mutex held (upstream
commit 8b48ec23cc51a4e7c8dbaef5f34ebe67e1a80934)
Just borrowing this thread to discuss: I think this commit might have a
problem in some corner cases:
t1:                             t2:
action_store
 mddev_lock
  if (mddev->sync_thread)
   mddev_unlock
   md_unregister_thread
                                md_check_recovery
                                 set_bit(MD_RECOVERY_RUNNING, &mddev->recovery)
                                 queue_work(md_misc_wq, &mddev->del_work)
   mddev_lock_nointr
   md_reap_sync_thread
   // clear running
 mddev_lock

t3:
md_start_sync
// running is not set
Our tests reported a problem that could, in theory, be caused by this, but
we can't be sure for now...
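
To make the interleaving above easier to follow, here is a minimal
single-threaded replay of it. This is a hypothetical model, not kernel code:
one bool stands in for MD_RECOVERY_RUNNING and one for the queued del_work,
and the function names only mirror the steps in the diagram.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two pieces of state involved in the race. */
struct model {
    bool running;      /* MD_RECOVERY_RUNNING analogue */
    bool work_queued;  /* del_work queued on md_misc_wq analogue */
};

/* t2: md_check_recovery sets "running" and queues the work item. */
static void t2_check_recovery(struct model *m)
{
    m->running = true;
    m->work_queued = true;
}

/* t1: action_store retakes the lock and reaps, which clears "running". */
static void t1_reap(struct model *m)
{
    m->running = false;
}

/* t3: the queued work runs; returns whether it still sees "running". */
static bool t3_start_sync(struct model *m)
{
    assert(m->work_queued);  /* the work must have been queued by t2 */
    m->work_queued = false;
    return m->running;
}

/* Replays the steps in diagram order. Returns true when t3 runs with the
 * flag already cleared, i.e. the suspected corner case. */
static bool replay_race(void)
{
    struct model m = { false, false };

    t2_check_recovery(&m);   /* runs while t1 has dropped the lock */
    t1_reap(&m);             /* t1 clears the flag that t2 just set */
    return !t3_start_sync(&m);
}
```

In this model replay_race() returns true: the work queued by t2 starts after
t1 has cleared the flag. Whether the real code can hit this ordering is
exactly what is still unconfirmed.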
We thought about how to fix this: instead of calling
md_unregister_thread() here to wait synchronously for sync_thread to
finish, we do this asynchronously, like md_set_readonly() and
do_md_stop() do.
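
A rough user-space sketch of that idea follows. It assumes "asynchronously"
means: ask the thread to stop, then sleep until a "running" flag clears,
rather than tearing the thread down directly. That reading is an assumption,
and every name here (lock, state_changed, running, interrupted, sync_worker,
stop_and_wait, demo_async_stop) is hypothetical; it is not the proposed
kernel patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_changed = PTHREAD_COND_INITIALIZER;
static bool running;      /* MD_RECOVERY_RUNNING analogue */
static bool interrupted;  /* MD_RECOVERY_INTR analogue */

/* Worker: waits for the stop request, then clears `running` itself and
 * wakes anyone waiting for that. */
static void *sync_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!interrupted)
        pthread_cond_wait(&state_changed, &lock);
    running = false;
    pthread_cond_broadcast(&state_changed);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Request a stop and sleep until the worker reports completion.
 * pthread_cond_wait releases `lock` while sleeping, so the worker can
 * make progress. */
static void stop_and_wait(void)
{
    pthread_mutex_lock(&lock);
    interrupted = true;
    pthread_cond_broadcast(&state_changed);
    while (running)
        pthread_cond_wait(&state_changed, &lock);
    pthread_mutex_unlock(&lock);
}

/* Returns true once the worker has cleared `running`. */
static bool demo_async_stop(void)
{
    pthread_t tid;
    bool stopped;
    int rc;

    pthread_mutex_lock(&lock);
    running = true;        /* set by the starter, before the thread runs */
    interrupted = false;
    pthread_mutex_unlock(&lock);

    if (pthread_create(&tid, NULL, sync_worker, NULL) != 0)
        return false;
    stop_and_wait();
    rc = pthread_join(tid, NULL);  /* only cleanup; the wait is already over */
    assert(rc == 0);

    pthread_mutex_lock(&lock);
    stopped = !running;
    pthread_mutex_unlock(&lock);
    return stopped;
}
```

The point of the sketch is that the waiter only depends on the flag, so the
party that set the flag and the party that clears it stay consistent no
matter who requested the stop.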