Re: [PATCHv5 4/6] drm/msm/a6xx: Add support for using system cache(LLC)

From: Jordan Crouse
Date: Mon Sep 28 2020 - 12:11:37 EST


On Mon, Sep 28, 2020 at 05:56:55PM +0530, Sai Prakash Ranjan wrote:
> Hi Jordan,
>
> On 2020-09-23 20:33, Jordan Crouse wrote:
> >On Tue, Sep 22, 2020 at 11:48:17AM +0530, Sai Prakash Ranjan wrote:
> >>From: Sharat Masetty <smasetty@xxxxxxxxxxxxxx>
> >>
> >>The last level system cache can be partitioned to 32 different
> >>slices of which GPU has two slices preallocated. One slice is
> >>used for caching GPU buffers and the other slice is used for
> >>caching the GPU SMMU pagetables. This talks to the core system
> >>cache driver to acquire the slice handles, configure the SCID's
> >>to those slices and activates and deactivates the slices upon
> >>GPU power collapse and restore.
> >>
> >>Some support from the IOMMU driver is also needed to make use
> >>of the system cache to set the right TCR attributes. GPU then
> >>has the ability to override a few cacheability parameters which
> >>it does to override write-allocate to write-no-allocate as the
> >>GPU hardware does not benefit much from it.
> >>
> >>DOMAIN_ATTR_SYS_CACHE is another domain level attribute used by the
> >>IOMMU driver to set the right attributes to cache the hardware
> >>pagetables into the system cache.
> >>
> >>Signed-off-by: Sharat Masetty <smasetty@xxxxxxxxxxxxxx>
> >>[saiprakash.ranjan: fix to set attr before device attach to iommu and
> >>rebase]
> >>Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@xxxxxxxxxxxxxx>
> >>---
> >> drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 83 +++++++++++++++++++++++++
> >> drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 4 ++
> >> drivers/gpu/drm/msm/adreno/adreno_gpu.c | 17 +++++
> >> 3 files changed, 104 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>index 8915882e4444..151190ff62f7 100644
> >>--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>@@ -8,7 +8,9 @@
> >> #include "a6xx_gpu.h"
> >> #include "a6xx_gmu.xml.h"
> >>
> >>+#include <linux/bitfield.h>
> >> #include <linux/devfreq.h>
> >>+#include <linux/soc/qcom/llcc-qcom.h>
> >>
> >> #define GPU_PAS_ID 13
> >>
> >>@@ -1022,6 +1024,79 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> >> return IRQ_HANDLED;
> >> }
> >>
> >>+static void a6xx_llc_rmw(struct a6xx_gpu *a6xx_gpu, u32 reg, u32 mask,
> >>u32 or)
> >>+{
> >>+ return msm_rmw(a6xx_gpu->llc_mmio + (reg << 2), mask, or);
> >>+}
> >>+
> >>+static void a6xx_llc_write(struct a6xx_gpu *a6xx_gpu, u32 reg, u32
> >>value)
> >>+{
> >>+ return msm_writel(value, a6xx_gpu->llc_mmio + (reg << 2));
> >>+}
> >>+
> >>+static void a6xx_llc_deactivate(struct a6xx_gpu *a6xx_gpu)
> >>+{
> >>+ llcc_slice_deactivate(a6xx_gpu->llc_slice);
> >>+ llcc_slice_deactivate(a6xx_gpu->htw_llc_slice);
> >>+}
> >>+
> >>+static void a6xx_llc_activate(struct a6xx_gpu *a6xx_gpu)
> >>+{
> >>+ u32 cntl1_regval = 0;
> >>+
> >>+ if (IS_ERR(a6xx_gpu->llc_mmio))
> >>+ return;
> >>+
> >>+ if (!llcc_slice_activate(a6xx_gpu->llc_slice)) {
> >>+ u32 gpu_scid = llcc_get_slice_id(a6xx_gpu->llc_slice);
> >>+
> >>+ gpu_scid &= 0x1f;
> >>+ cntl1_regval = (gpu_scid << 0) | (gpu_scid << 5) | (gpu_scid << 10) |
> >>+ (gpu_scid << 15) | (gpu_scid << 20);
> >>+ }
> >>+
> >>+ if (!llcc_slice_activate(a6xx_gpu->htw_llc_slice)) {
> >>+ u32 gpuhtw_scid = llcc_get_slice_id(a6xx_gpu->htw_llc_slice);
> >>+
> >>+ gpuhtw_scid &= 0x1f;
> >>+ cntl1_regval |= FIELD_PREP(GENMASK(29, 25), gpuhtw_scid);
> >>+ }
> >>+
> >>+ if (cntl1_regval) {
> >>+ /*
> >>+ * Program the slice IDs for the various GPU blocks and GPU MMU
> >>+ * pagetables
> >>+ */
> >>+ a6xx_llc_write(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_1,
> >>cntl1_regval);
> >>+
> >>+ /*
> >>+ * Program cacheability overrides to not allocate cache lines on
> >>+ * a write miss
> >>+ */
> >>+ a6xx_llc_rmw(a6xx_gpu, REG_A6XX_CX_MISC_SYSTEM_CACHE_CNTL_0, 0xF,
> >>0x03);
> >>+ }
> >>+}
> >
> >This code has been around long enough that it pre-dates a650. On a650 and
> >other
> >MMU-500 targets the htw_llc is configured by the firmware and the
> >llc_slice is
> >configured in a different register.
> >
> >I don't think we need to pause everything and add support for the MMU-500
> >path,
> >but we do need a way to disallow LLCC on affected targets until such time
> >that
> >we can get it fixed up.
> >
>
> Thanks for taking a close look, does something like below look ok or
> something
> else is needed here?
>
> + /* Till the time we get in LLCC support for A650 */
> + if (!(info && info->revn == 650))
> + a6xx_llc_slices_init(pdev, a6xx_gpu);

It doesn't look like Rob picked this up for 5.10, so we have some time to do it
right. Would you like me to give you an add-on patch for mmu-500 targets?

Jordan

> Thanks,
> Sai
>
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
> of Code Aurora Forum, hosted by The Linux Foundation

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project