Re: [PATCH v2 6/9] drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.c

From: Tvrtko Ursulin
Date: Fri Mar 31 2023 - 04:06:17 EST



On 31/03/2023 04:33, Ira Weiny wrote:
Zhao Liu wrote:
From: Zhao Liu <zhao1.liu@xxxxxxxxx>

The use of kmap_atomic() is being deprecated in favor of
kmap_local_page()[1], and this patch converts the call from
kmap_atomic() to kmap_local_page().

The main difference between atomic and local mappings is that local
mappings doesn't disable page faults or preemption.

With kmap_local_page(), we can avoid the often unwanted side effect of
unnecessary page faults or preemption disables.

In drm/i915/gem/selftests/i915_gem_context.c, functions cpu_fill() and
cpu_check() mainly uses mapping to flush cache and check/assign the
value.

There're 2 reasons why cpu_fill() and cpu_check() don't need to disable
pagefaults and preemption for mapping:

1. The flush operation is safe. cpu_fill() and cpu_check() call
drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since
CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in
drm_clflush_virt_range(), the flush operation is global.

2. Any context switch caused by preemption or page faults (page fault
may cause sleep) doesn't affect the validity of local mapping.

Therefore, cpu_fill() and cpu_check() are functions where the use of
kmap_local_page() in place of kmap_atomic() is correctly suited.

Convert the calls of kmap_atomic() / kunmap_atomic() to
kmap_local_page() / kunmap_local().

[1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@xxxxxxxxx

v2:
* Dropped hot plug related description since it has nothing to do with
kmap_local_page().
* No code change since v1, and added description of the motivation of
using kmap_local_page().

Suggested-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Suggested-by: Ira Weiny <ira.weiny@xxxxxxxxx>

First off I think this is fine.

But as I looked at this final selftests patch I began to wonder how the
memory being mapped here and in the previous selftests patches are
allocated. Does highmem need to be considered at all? Unfortunately, I
could not determine where the memory in the SG list of this test gem
object was allocated.

AFAICS cpu_fill() is only called in create_test_object(). Digging into
huge_gem_object() did not reveal where these pages were allocated from.

I wonder if these kmap_local_page() calls could be removed entirely based
on knowing that the pages were allocated from low mem? Removing yet
another user of highmem altogether would be best if possible.

Do you know how these test objects are created? Do the pages come from
user space somehow?

FWIW

create_test_object
-> huge_gem_object
-> i915_gem_object_init(obj, &huge_ops, &lock_class, 0);

Which is:

static const struct drm_i915_gem_object_ops huge_ops = {
.name = "huge-gem",
.get_pages = huge_get_pages,
.put_pages = huge_put_pages,
};

And:

huge_get_pages()
...
#define GFP (GFP_KERNEL | __GFP_NOWARN | __GFP_RETRY_MAYFAIL)
...
page = alloc_page(GFP | __GFP_HIGHMEM);


Regardless this is still a step in the right direction so:

Reviewed-by: Ira Weiny <ira.weiny@xxxxxxxxx>

Yeah LGTM.

FYI I am yet to read through the rest of the series, but I don't think there will be anything problematic and it passed our CI so likely is good to pull in.

Regards,

Tvrtko


Suggested-by: Fabio M. De Francesco <fmdefrancesco@xxxxxxxxx>
Signed-off-by: Zhao Liu <zhao1.liu@xxxxxxxxx>
---
Suggested by credits:
Dave: Referred to his explanation about cache flush.
Ira: Referred to his task document, review comments and explanation
about cache flush.
Fabio: Referred to his boiler plate commit message and his description
about why kmap_local_page() should be preferred.
---
drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index a81fa6a20f5a..dcbc0b8e3323 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -481,12 +481,12 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value)
for (n = 0; n < real_page_count(obj); n++) {
u32 *map;
- map = kmap_atomic(i915_gem_object_get_page(obj, n));
+ map = kmap_local_page(i915_gem_object_get_page(obj, n));
for (m = 0; m < DW_PER_PAGE; m++)
map[m] = value;
if (!has_llc)
drm_clflush_virt_range(map, PAGE_SIZE);
- kunmap_atomic(map);
+ kunmap_local(map);
}
i915_gem_object_finish_access(obj);
@@ -512,7 +512,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
for (n = 0; n < real_page_count(obj); n++) {
u32 *map, m;
- map = kmap_atomic(i915_gem_object_get_page(obj, n));
+ map = kmap_local_page(i915_gem_object_get_page(obj, n));
if (needs_flush & CLFLUSH_BEFORE)
drm_clflush_virt_range(map, PAGE_SIZE);
@@ -538,7 +538,7 @@ static noinline int cpu_check(struct drm_i915_gem_object *obj,
}
out_unmap:
- kunmap_atomic(map);
+ kunmap_local(map);
if (err)
break;
}
--
2.34.1