[PATCH v2 00/21] Fix performance regressions with TLB and add GuC support

From: Mauro Carvalho Chehab
Date: Thu Jul 14 2022 - 08:06:44 EST


TLB invalidation is a slow operation. It should not be doing lightly, as it
causes performance regressions, like this:

[178.821002] i915 0000:00:02.0: [drm] *ERROR* rcs0 TLB invalidation did not complete in 4ms!

This series contain

1) some patches that makes TLB invalidation to happen only on
active, non-wedged engines, doing cache invalidation in batch
and only when GT objects are exposed to userspace:

drm/i915/gt: Ignore TLB invalidations on idle engines
drm/i915/gt: Only invalidate TLBs exposed to user manipulation
drm/i915/gt: Skip TLB invalidations once wedged
drm/i915/gt: Batch TLB invalidations
drm/i915/gt: Move TLB invalidation to its own file

2) It fixes two bugs, being the first a workaround:

drm/i915/gt: Invalidate TLB of the OA unit at TLB invalidations
drm/i915: Invalidate the TLBs on each GT

drm/i915/guc: Introduce TLB_INVALIDATION_ALL action

3) It adds GuC support. Besides providing TLB invalidation on some
additional hardware, this should also help serializing GuC operations
with TLB invalidation:

drm/i915/guc: Introduce TLB_INVALIDATION_ALL action
drm/i915/guc: Define CTB based TLB invalidation routines
drm/i915: Add platform macro for selective tlb flush
drm/i915: Define GuC Based TLB invalidation routines
drm/i915: Add generic interface for tlb invalidation for XeHP
drm/i915: Use selective tlb invalidations where supported

4) It adds the corresponding kernel-doc markups for the kAPI
used for TLB invalidation.

While I could have split this into smaller pieces, I'm opting to send
them altogether, in order for CI trybot to better verify what issues
will be closed with this series.

---

v2:
- no changes. Just rebased on the top of drm-tip: 2022y-07m-14d-08h-35m-36s,
as CI trybot was having troubles applying it. Hopefully, it will now work.

Chris Wilson (7):
drm/i915/gt: Ignore TLB invalidations on idle engines
drm/i915/gt: Invalidate TLB of the OA unit at TLB invalidations
drm/i915/gt: Only invalidate TLBs exposed to user manipulation
drm/i915/gt: Skip TLB invalidations once wedged
drm/i915/gt: Batch TLB invalidations
drm/i915/gt: Move TLB invalidation to its own file
drm/i915: Invalidate the TLBs on each GT

Mauro Carvalho Chehab (8):
drm/i915/gt: document with_intel_gt_pm_if_awake()
drm/i915/gt: describe the new tlb parameter at i915_vma_resource
drm/i915/guc: use kernel-doc for enum intel_guc_tlb_inval_mode
drm/i915/guc: document the TLB invalidation struct members
drm/i915: document tlb field at struct drm_i915_gem_object
drm/i915/gt: document TLB cache invalidation functions
drm/i915/guc: describe enum intel_guc_tlb_invalidation_type
drm/i915/guc: document TLB cache invalidation functions

Piotr Piórkowski (1):
drm/i915/guc: Introduce TLB_INVALIDATION_ALL action

Prathap Kumar Valsan (5):
drm/i915/guc: Define CTB based TLB invalidation routines
drm/i915: Add platform macro for selective tlb flush
drm/i915: Define GuC Based TLB invalidation routines
drm/i915: Add generic interface for tlb invalidation for XeHP
drm/i915: Use selective tlb invalidations where supported

drivers/gpu/drm/i915/Makefile | 1 +
.../gpu/drm/i915/gem/i915_gem_object_types.h | 6 +-
drivers/gpu/drm/i915/gem/i915_gem_pages.c | 28 +-
drivers/gpu/drm/i915/gt/intel_engine.h | 1 +
drivers/gpu/drm/i915/gt/intel_gt.c | 125 +-------
drivers/gpu/drm/i915/gt/intel_gt.h | 2 -
.../gpu/drm/i915/gt/intel_gt_buffer_pool.h | 3 +-
drivers/gpu/drm/i915/gt/intel_gt_defines.h | 11 +
drivers/gpu/drm/i915/gt/intel_gt_pm.h | 10 +
drivers/gpu/drm/i915/gt/intel_gt_regs.h | 8 +
drivers/gpu/drm/i915/gt/intel_gt_types.h | 22 +-
drivers/gpu/drm/i915/gt/intel_ppgtt.c | 8 +-
drivers/gpu/drm/i915/gt/intel_tlb.c | 295 ++++++++++++++++++
drivers/gpu/drm/i915/gt/intel_tlb.h | 30 ++
.../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h | 54 ++++
drivers/gpu/drm/i915/gt/uc/intel_guc.c | 232 ++++++++++++++
drivers/gpu/drm/i915/gt/uc/intel_guc.h | 36 +++
drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 24 +-
drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h | 9 +
.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 91 +++++-
drivers/gpu/drm/i915/i915_drv.h | 4 +-
drivers/gpu/drm/i915/i915_pci.c | 1 +
drivers/gpu/drm/i915/i915_vma.c | 46 ++-
drivers/gpu/drm/i915/i915_vma.h | 2 +
drivers/gpu/drm/i915/i915_vma_resource.c | 9 +-
drivers/gpu/drm/i915/i915_vma_resource.h | 6 +-
drivers/gpu/drm/i915/intel_device_info.h | 1 +
27 files changed, 910 insertions(+), 155 deletions(-)
create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_defines.h
create mode 100644 drivers/gpu/drm/i915/gt/intel_tlb.c
create mode 100644 drivers/gpu/drm/i915/gt/intel_tlb.h

--
2.36.1