How about in uacce_remove
On 2022/6/16 下午4:14, Jean-Philippe Brucker wrote:
On Thu, Jun 16, 2022 at 12:10:18PM +0800, Zhangfei Gao wrote:
I agree we need something, what I mean is that this check is notThe check if (!uacce->parent->driver) is required, otherwise NULL pointerdiff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.cI don't think this is useful, because the core clears parent->driver after
index 281c54003edc..b6219c6bfb48 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
if (!q)
return -ENOMEM;
+ mutex_lock(&uacce->queues_lock);
+
+ if (!uacce->parent->driver) {
having run uacce_remove():
rmmod hisi_zip open()
... uacce_fops_open()
__device_release_driver() ...
pci_device_remove()
hisi_zip_remove()
hisi_qm_uninit()
uacce_remove()
... ...
mutex_lock(uacce->queues_lock)
... if (!uacce->parent->driver)
device_unbind_cleanup() /* driver still valid, proceed */
dev->driver = NULL
may happen.
sufficient.
iommu_sva_bind_deviceAh right, relying on the return value of bind() wouldn't be enough even if
const struct iommu_ops *ops = dev_iommu_ops(dev); ->
dev->iommu->iommu_dev->ops
rmmod has no issue, but remove parent pci device has the issue.
we mandated SVA.
[...]
The refcount only ensures that the uacce_device object is not freed asI think we need the global uacce_mutex to serialize uacce_remove() andSince fops_open get char device refcount, uacce_release will not happen
uacce_fops_open(). uacce_remove() would do everything, including
xa_erase(), while holding that mutex. And uacce_fops_open() would try to
obtain the uacce object from the xarray while holding the mutex, which
fails if the uacce object is being removed.
until open returns.
long as there are open fds. But uacce_remove() can run while there are
open fds, or fds in the process of being opened. And atfer uacce_remove()
runs, the uacce_device object still exists but is mostly unusable. For
example once the module is freed, uacce->ops is not valid anymore. But
currently uacce_fops_open() may dereference the ops in this case:
uacce_fops_open()
if (!uacce->parent->driver)
/* Still valid, keep going */
... rmmod
uacce_remove()
... free_module()
uacce->ops->get_queue() /* BUG */
uacce_remove should wait for uacce->queues_lock, until fops_open release the lock.
If open happen just after the uacce_remove: unlock, uacce_bind_queue in open should fail.
Accessing uacce->ops after free_module() is a use-after-free. We need allyou men parent release the resources.
the fops to synchronize with uacce_remove() to ensure they don't use anyAfter fops_open, currently we are counting on parent driver stop all dma first, then call uacce_remove, which is assumption.
resource of the parent after it's been freed.
Like drivers/crypto/hisilicon/zip/zip_main.c: hisi_qm_wait_task_finish, which will wait uacce_release.
If comments this , there may other issue,
Unable to handle kernel paging request at virtual address ffff80000b700204
pc : hisi_qm_cache_wb.part.0+0x2c/0xa0
I see uacce_fops_poll() may have the same problem, and should be insideDo we need consider this, uacce_remove can happen anytime but not waiting dma stop?
uacce_mutex.
Not sure uacce_mutex can do this.
Currently the sequence is
mutex_lock(&uacce->queues_lock);
mutex_lock(&uacce_mutex);
Or we set all the callbacks of uacce_ops to NULL?
Module_get/put only works for module, but not for removing device.
Thanks
Thanks,
Jean