Re: [BUG] possible race between md_free_disk and md_notify_reboot

From: Guillaume Morin
Date: Wed Feb 19 2025 - 22:06:06 EST




> On Feb 19, 2025, at 10:26 PM, Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> Hi
>
>> On 2025/02/20 3:43, Guillaume Morin wrote:
>> There seems to be nothing in the code that tries to prevent this
>> specific race
>
> Did you notice that mddev_get() is called from md_notify_reboot() while
> the lock is still held? How can mddev_free() race here if
> mddev_get() succeeds?
>

Yes, I did notice that. However, it was not clear to me what guarantees that mddev_get() would fail, since mddev_free() does not check or synchronize with the active atomic. I might be missing some logic outside of md.c. If there is a guarantee that mddev_get() fails once mddev_free() has been called, that would definitely address the UAF (use-after-free) concern.

However, the main race we reported is the one where the item that the *next* pointer (n) in the loop points to is removed by mddev_free(). The list iteration macro saves the next pointer, and as far as I can tell nothing prevents that item from being deleted while the lock is released. Its poisoned list pointers would then be accessed at the end of the loop body, when the saved next pointer is used to advance.
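
To make the sequence concrete, here is a minimal single-threaded userspace sketch of what I believe happens. It is not kernel code: the list helpers are simplified stand-ins for the ones in list.h, struct fake_mddev, simulate_race() and simulate_reread_next() are hypothetical names, and the concurrent mddev_free() is emulated by deleting the next entry at the point where the real loop would have dropped the lock. The poison constants only mimic the idea of LIST_POISON1/LIST_POISON2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel list primitives. */
struct list_head { struct list_head *next, *prev; };

/* Illustrative poison values, mimicking LIST_POISON1/LIST_POISON2. */
#define POISON1 ((struct list_head *)(uintptr_t)0x100)
#define POISON2 ((struct list_head *)(uintptr_t)0x122)

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink an entry and poison its pointers, as list_del() does. */
static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = POISON1;
    e->prev = POISON2;
}

/* Hypothetical stand-in for struct mddev. */
struct fake_mddev { int id; struct list_head all_mddevs; };

static void setup(struct list_head *head, struct fake_mddev *a,
                  struct fake_mddev *b, struct fake_mddev *c)
{
    list_init(head);
    list_add_tail(&a->all_mddevs, head);
    list_add_tail(&b->all_mddevs, head);
    list_add_tail(&c->all_mddevs, head);
}

/*
 * "Safe" iteration that saves the next pointer n before the loop body.
 * Returns 1 if n refers to an entry deleted while the lock was dropped.
 */
int simulate_race(void)
{
    struct list_head head, *pos, *n;
    struct fake_mddev a = { .id = 0 }, b = { .id = 1 }, c = { .id = 2 };

    setup(&head, &a, &b, &c);
    for (pos = head.next, n = pos->next; pos != &head; pos = n, n = pos->next) {
        /* The real loop drops the lock here ... */
        if (pos == &a.all_mddevs)
            list_del(&b.all_mddevs);   /* emulated concurrent mddev_free() */
        /* ... and re-takes it here. n still points at the deleted entry. */
        if (n->next == POISON1)
            return 1;   /* advancing would dereference a poisoned pointer */
    }
    return 0;
}

/*
 * Variant that re-reads pos->next after the lock is re-taken. This assumes
 * pos itself cannot be unlinked in the meantime (e.g. a reference is held).
 * Returns the number of entries visited.
 */
int simulate_reread_next(void)
{
    struct list_head head, *pos;
    struct fake_mddev a = { .id = 0 }, b = { .id = 1 }, c = { .id = 2 };
    int visited = 0;

    setup(&head, &a, &b, &c);
    for (pos = head.next; pos != &head; pos = pos->next) {
        visited++;
        if (pos == &a.all_mddevs)
            list_del(&b.all_mddevs);   /* same emulated deletion */
    }
    return visited;
}
```

In this toy model, simulate_race() detects that the saved pointer is stale right after the emulated deletion, while simulate_reread_next() walks only the two entries still on the list. The second function is just an illustration of why the saved pointer is the problem, not a proposed patch.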

If this is incorrect, I am not sure how the crash would happen. I should have mentioned this originally: the crash is reproducible, so it is not a one-off issue. Do you see an alternative explanation for it?

Thanks in advance for your help


> Thanks,
> Kuai
>