[PATCH v2 5/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.c

From: Zhao Liu
Date: Wed Mar 29 2023 - 03:25:31 EST


From: Zhao Liu <zhao1.liu@xxxxxxxxx>

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption (the preemption is
disabled for !PREEMPT_RT case, otherwise it only disables migration)..

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults or preemption disables.

In drm/i915/gem/selftests/i915_gem_coherency.c, functions cpu_set()
and cpu_get() mainly uses mapping to flush cache and assign the value.
There're 2 reasons why cpu_set() and cpu_get() don't need to disable
pagefaults and preemption for mapping:

1. The flush operation is safe. cpu_set() and cpu_get() call
drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.

Therefore, cpu_set() and cpu_get() are functions where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@xxxxxxxxx

v2:
* Dropped hot plug related description since it has nothing to do with
kmap_local_page().
* No code change since v1, and added description of the motivation of
using kmap_local_page().

Suggested-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Suggested-by: Ira Weiny <ira.weiny@xxxxxxxxx>
Suggested-by: Fabio M. De Francesco <fmdefrancesco@xxxxxxxxx>
Signed-off-by: Zhao Liu <zhao1.liu@xxxxxxxxx>
---
Suggested by credits:
Dave: Referred to his explanation about cache flush.
Ira: Referred to his task document, review comments and explanation
about cache flush.
Fabio: Referred to his boiler plate commit message and his description
about why kmap_local_page() should be preferred.
---
.../gpu/drm/i915/gem/selftests/i915_gem_coherency.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
index 3bef1beec7cb..beeb3e12eccc 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c
@@ -24,7 +24,6 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
{
unsigned int needs_clflush;
struct page *page;
- void *map;
u32 *cpu;
int err;

@@ -34,8 +33,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
goto out;

page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
- map = kmap_atomic(page);
- cpu = map + offset_in_page(offset);
+ cpu = kmap_local_page(page) + offset_in_page(offset);

if (needs_clflush & CLFLUSH_BEFORE)
drm_clflush_virt_range(cpu, sizeof(*cpu));
@@ -45,7 +43,7 @@ static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
if (needs_clflush & CLFLUSH_AFTER)
drm_clflush_virt_range(cpu, sizeof(*cpu));

- kunmap_atomic(map);
+ kunmap_local(cpu);
i915_gem_object_finish_access(ctx->obj);

out:
@@ -57,7 +55,6 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
{
unsigned int needs_clflush;
struct page *page;
- void *map;
u32 *cpu;
int err;

@@ -67,15 +64,14 @@ static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
goto out;

page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
- map = kmap_atomic(page);
- cpu = map + offset_in_page(offset);
+ cpu = kmap_local_page(page) + offset_in_page(offset);

if (needs_clflush & CLFLUSH_BEFORE)
drm_clflush_virt_range(cpu, sizeof(*cpu));

*v = *cpu;

- kunmap_atomic(map);
+ kunmap_local(cpu);
i915_gem_object_finish_access(ctx->obj);

out:
--
2.34.1