Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal

From: Bart Van Assche
Date: Mon Sep 20 2021 - 17:38:50 EST


On 9/17/21 10:04 PM, Luis Chamberlain wrote:
A sketch of how this can happen follows:

CPU A                              CPU B
                                   whatever_store()
module_unload
  mutex_lock(foo)
                                   mutex_lock(foo)
   del_gendisk(zram->disk);
     device_del()
       device_remove_groups()

In this situation whatever_store() is waiting for the mutex foo to
become unlocked, but that won't happen until module removal is complete.
But module removal won't complete until the sysfs file being poked
completes which is waiting for a lock already held.

If I remember correctly I encountered the deadlock scenario described
above for the first time about ten years ago while working on the SCST
project. We solved this deadlock by having the module unload code remove
the sysfs attributes, e.g. by calling sysfs_remove_file(), before it
locks the mutex foo. This works because calling sysfs_remove_file()
multiple times in a row is safe. Is that solution good enough for the
zram driver?
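
To make the proposed ordering concrete, here is a minimal userspace
sketch. It is a pthread model, not kernel code and not the SCST or zram
implementation. All names in it (model_attr, model_remove_file(),
model_store(), model_unload(), run_model()) are hypothetical. It assumes
that removing an attribute blocks new store callbacks and waits for the
ones already running, and it models a repeated removal as a no-op.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of a sysfs attribute's active references. */
struct model_attr {
	pthread_mutex_t lock;
	pthread_cond_t drained;
	int active;	/* store callbacks currently running */
	bool removed;	/* set once removal has started */
};

static struct model_attr attr = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false
};
static pthread_mutex_t foo = PTHREAD_MUTEX_INITIALIZER; /* the driver mutex */
static int stores_done, stores_refused;

/* Stand-in for sysfs_remove_file(): refuse new callers, wait for running
 * ones. A second call finds nothing left to wait for and returns at once. */
static void model_remove_file(struct model_attr *a)
{
	pthread_mutex_lock(&a->lock);
	a->removed = true;
	while (a->active > 0)
		pthread_cond_wait(&a->drained, &a->lock);
	pthread_mutex_unlock(&a->lock);
}

/* Stand-in for whatever_store(): runs under an active reference, takes foo. */
static void *model_store(void *arg)
{
	struct model_attr *a = arg;

	pthread_mutex_lock(&a->lock);
	if (a->removed) {		/* attribute is already gone */
		stores_refused++;
		pthread_mutex_unlock(&a->lock);
		return NULL;
	}
	a->active++;
	pthread_mutex_unlock(&a->lock);

	pthread_mutex_lock(&foo);
	stores_done++;
	pthread_mutex_unlock(&foo);

	pthread_mutex_lock(&a->lock);
	if (--a->active == 0)
		pthread_cond_broadcast(&a->drained);
	pthread_mutex_unlock(&a->lock);
	return NULL;
}

/* Module unload with the attribute removed before foo is taken. */
static void model_unload(void)
{
	model_remove_file(&attr);	/* drain writers while foo is NOT held */
	pthread_mutex_lock(&foo);
	assert(attr.active == 0);	/* no store callback can be waiting on foo */
	model_remove_file(&attr);	/* stands in for the later device_del() path */
	pthread_mutex_unlock(&foo);
}

/* Races one writer against one unload; returns 0 once both have finished. */
int run_model(void)
{
	pthread_t writer;

	if (pthread_create(&writer, NULL, model_store, &attr) != 0)
		return -1;
	model_unload();
	return pthread_join(writer, NULL);
}
```

In this model the writer either finishes before the drain completes or
is refused because the attribute is already gone, so unload never waits
for a callback while holding foo. Moving the first model_remove_file()
call below pthread_mutex_lock(&foo) would recreate the ordering in the
diagram above.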

Thanks,

Bart.