On Tue, Jul 11, 2023 at 12:46 AM Christian König
<christian.koenig@xxxxxxx> wrote:
[SNIP]I think that is only true because of the drm_sched_fence_free() path?
Well that won't work like this, using the the SLAB_TYPESAFE_BY_RCU flag---Reviewed-by: Luben Tuikov <luben.tuikov@xxxxxxx>
drivers/gpu/drm/scheduler/sched_fence.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index ef120475e7c6..b624711c6e03 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -35,7 +35,7 @@ static int __init drm_sched_fence_slab_init(void)
{
sched_fence_slab = kmem_cache_create(
"drm_sched_fence", sizeof(struct drm_sched_fence), 0,
- SLAB_HWCACHE_ALIGN, NULL);
+ SLAB_HWCACHE_ALIGN | SLAB_TYPESAFE_BY_RCU, NULL);
if (!sched_fence_slab)
return -ENOMEM;
But let it simmer for 24 hours so Christian can see it too (CC-ed).
was suggested before and is not allowed for dma_fence objects.
The flag SLAB_TYPESAFE_BY_RCU can only be used if the objects in the
slab can't be reused within an RCU time period or if that reuse doesn't
matter for the logic. And exactly that is not guaranteed for dma_fence
objects.
But that could also use call_rcu(). (It looks like it is only an
error path.)
It should also not be necessary because the scheduler fences arePossibly I need to use rcu_read_lock()? But I think the idr_lock
released using call_rcu:
static void drm_sched_fence_release_scheduled(struct dma_fence *f)
{
struct drm_sched_fence *fence = to_drm_sched_fence(f);
dma_fence_put(fence->parent);
call_rcu(&fence->finished.rcu, drm_sched_fence_free_rcu);
}
That looks more like you have a reference count problem in MSM.
which protected dma_fence_get_rcu() (and is held until the fence is
removed from fence_idr, before it's reference is dropped in
__msm_gem_submit_destroy()) makes that unnecessary.
BR,
-R
Regards,
Christian.