Re: [PATCH 1/9] drm/i915: Use kmap_local_page() in gem/i915_gem_object.c

From: Fabio M. De Francesco
Date: Sat Oct 29 2022 - 07:17:37 EST


On Monday, 17 October 2022 11:37:17 CEST Zhao Liu wrote:
> From: Zhao Liu <zhao1.liu@xxxxxxxxx>
>
> The use of kmap_atomic() is being deprecated in favor of
> kmap_local_page()[1].
>
> The main difference between atomic and local mappings is that local
> mappings doesn't disable page faults or preemption.

You are right about page faults, which are never disabled by
kmap_local_page(). However, kmap_atomic() might not disable preemption: that
depends on CONFIG_PREEMPT_RT.

Please refer to how kmap_atomic_prot() works (this function is called by
kmap_atomic() when kernels have HIGHMEM enabled).
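To make the point concrete, here is a minimal userspace model (not kernel
code) of the side effects as I understand them. The names model_kmap_atomic(),
model_kmap_local_page() and struct model_effects are hypothetical and exist
only for this illustration; the preempt_rt parameter stands in for
IS_ENABLED(CONFIG_PREEMPT_RT).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of what a mapping call leaves disabled. */
struct model_effects {
	bool preempt_disabled;
	bool pagefault_disabled;
};

/*
 * Models kmap_atomic(): page faults are always disabled, but preemption
 * is disabled only on non-RT kernels (RT kernels disable migration
 * instead, which this model does not track).
 */
struct model_effects model_kmap_atomic(bool preempt_rt)
{
	struct model_effects e;

	e.preempt_disabled = !preempt_rt;
	e.pagefault_disabled = true;
	return e;
}

/* Models kmap_local_page(): neither preemption nor page faults are disabled. */
struct model_effects model_kmap_local_page(void)
{
	struct model_effects e = { false, false };

	return e;
}

/* Small self-check of the model above. */
void model_effects_check(void)
{
	assert(model_kmap_atomic(false).preempt_disabled);
	assert(!model_kmap_atomic(true).preempt_disabled);
	assert(model_kmap_atomic(true).pagefault_disabled);
	assert(!model_kmap_local_page().preempt_disabled);
	assert(!model_kmap_local_page().pagefault_disabled);
}
```

The only thing this is meant to show is that "kmap_atomic() disables
preemption" holds for one configuration only, so code should not depend on it.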

>
> There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't
> need to disable pagefaults and preemption for mapping:
>
> 1. The flush operation is safe for CPU hotplug when preemption is not
> disabled.

I'm confused here. Why are you talking about CPU hotplug?
In any case, developers should never rely on implicit calls to
preempt_disable(), for the reasons given above. Therefore, flush operations
should be allowed regardless of that potential side effect of kmap_atomic().

> In drm/i915/gem/i915_gem_object.c, the function
> i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range()

If I recall correctly, drm_clflush_virt_range() can always be called with page
faults and preemption enabled. If so, this is enough to say that the
conversion is safe.

Is this code explicitly related to flushing the cache lines before removing or
adding CPUs? If I recall correctly, there are several other reasons for
issuing cache line flushes. Am I wrong about this?

Can you please say more about what I'm missing here?

> to
> use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86
> and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush
> operation is global and any issue with cpu's being added or removed
> can be handled safely.

Again your main concern is about CPU hotplug.

Even if I'm missing something, do we really need all these details about the
inner workings of drm_clflush_virt_range()?

I'm not an expert, so it may be that I'm wrong about everything I wrote above.

Therefore, can you please elaborate a little more for readers with very little
knowledge of these kinds of things (like me and perhaps others)?

> 2. Any context switch caused by preemption or sleep (pagefault may
> cause sleep) doesn't affect the validity of local mapping.

I'd replace "preemption or sleep" with "preemption and page faults", since
you yourself then added that page faults can put tasks to sleep.

> Therefore, i915_gem_object_read_from_page_kmap() is a function where
> the use of kmap_local_page() in place of kmap_atomic() is correctly
> suited.
>
> Convert the calls of kmap_atomic() / kunmap_atomic() to
> kmap_local_page() / kunmap_local().
>
> And remove the redundant variable that stores the address of the mapped
> page since kunmap_local() can accept any pointer within the page.
>
> [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@xxxxxxxxx
>
> Suggested-by: Dave Hansen <dave.hansen@xxxxxxxxx>
> Suggested-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> Suggested-by: Fabio M. De Francesco <fmdefrancesco@xxxxxxxxx>
> Signed-off-by: Zhao Liu <zhao1.liu@xxxxxxxxx>
> ---
> Suggested by credits:
> Dave: Referred to his explanation about cache flush.
> Ira: Referred to his task document, review comments and explanation about
> cache flush.
> Fabio: Referred to his boiler plate commit message.
> ---
> drivers/gpu/drm/i915/gem/i915_gem_object.c | 8 +++-----
> 1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index 369006c5317f..a0072abed75e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -413,17 +413,15 @@ void __i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
> static void
> i915_gem_object_read_from_page_kmap(struct drm_i915_gem_object *obj, u64 offset, void *dst, int size)
> {
> - void *src_map;
> void *src_ptr;
>
> - src_map = kmap_atomic(i915_gem_object_get_page(obj, offset >> PAGE_SHIFT));
> -
> - src_ptr = src_map + offset_in_page(offset);
> + src_ptr = kmap_local_page(i915_gem_object_get_page(obj, offset >> PAGE_SHIFT))
> + + offset_in_page(offset);
> if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> drm_clflush_virt_range(src_ptr, size);
> memcpy(dst, src_ptr, size);
>
> - kunmap_atomic(src_map);
> + kunmap_local(src_ptr);
> }
>
> static void

The changes look good, but I'd like to better understand the commit message.
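
For readers wondering why passing src_ptr (rather than the page base) to
kunmap_local() is fine: as far as I understand, the unmap path masks the
address down to the page boundary, so any pointer within the mapped page
identifies the same mapping. Below is a minimal userspace sketch of that
arithmetic only. The MODEL_* macros and model_* functions are hypothetical
stand-ins, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  ((uintptr_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK  (~(MODEL_PAGE_SIZE - 1))

/* Stand-in for offset_in_page(): byte offset inside the page. */
uintptr_t model_offset_in_page(uint64_t offset)
{
	return (uintptr_t)(offset & (MODEL_PAGE_SIZE - 1));
}

/* Page base that can be recovered from any address inside the page. */
uintptr_t model_page_base(uintptr_t addr)
{
	return addr & MODEL_PAGE_MASK;
}

/* Checks that the interior pointer and the page base name the same page. */
void model_unmap_check(void)
{
	uintptr_t src_map = 0x42000;	/* pretend mapping address, page aligned */
	uint64_t offset = 0x3abc;	/* object offset; lands 0xabc into its page */
	uintptr_t src_ptr = src_map + model_offset_in_page(offset);

	assert(src_ptr != src_map);
	assert(model_page_base(src_ptr) == src_map);
}
```

This is why dropping the separate src_map variable does not change what gets
unmapped.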

Thanks,

Fabio