On 2018-07-04 10:31 AM, Christian KÃnig wrote:
Am 26.06.2018 um 16:31 schrieb Michel DÃnzer:Thanks. How about the following:
From: Michel DÃnzer <michel.daenzer@xxxxxxx>Well, the explanation here is not correct. For multiple BOs using the
Fixes the BUG_ON spuriously triggering under the following
circumstances:
* ttm_eu_reserve_buffers processes a list containing multiple BOs using
ÂÂ the same reservation object, so it calls
ÂÂ reservation_object_reserve_shared with that reservation object once
ÂÂ for each such BO.
* In reservation_object_reserve_shared, old->shared_count ==
ÂÂ old->shared_max - 1, so obj->staged is freed in preparation of an
ÂÂ in-place update.
* ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence
ÂÂ once for each of the BOs above, always with the same fence.
* The first call adds the fence in the remaining free slot, after which
ÂÂ old->shared_count == old->shared_max.
same reservation object we won't call
reservation_object_add_shared_fence() multiple times because we move
those to the duplicates list in ttm_eu_reserve_buffers().
But this bug can still happen because we call
reservation_object_add_shared_fence() manually with fences for the same
context in a couple of places.
One prominent case which comes to my mind are for the VM BOs during
updates. Another possibility are VRAM BOs which need to be cleared.
* ttm_eu_reserve_buffers calls reservation_object_reserve_shared.
* In reservation_object_reserve_shared, shared_count == shared_max - 1,
so obj->staged is freed in preparation of an in-place update.
* ttm_eu_fence_buffer_objects calls reservation_object_add_shared_fence,
after which shared_count == shared_max.
* The amdgpu driver also calls reservation_object_add_shared_fence for
the same reservation object, and the BUG_ON triggers.
However, nothing bad would happen in
reservation_object_add_shared_inplace, since all fences use the same
context, so they can only occupy a single slot.
Prevent this by moving the BUG_ON to where an overflow would actually
happen (e.g. if a buggy caller didn't call
reservation_object_reserve_shared before).
Also, I'll add a reference to https://bugs.freedesktop.org/106418 in v2,
as I suspect this fix is necessary under the circumstances described
there as well.