Re: [PATCH v2 4/4] drm/panfrost: Handle non-aligned lock addresses

From: Steven Price
Date: Wed Aug 25 2021 - 05:04:18 EST


On 24/08/2021 18:30, Alyssa Rosenzweig wrote:
> When locking memory, the base address is rounded down to the nearest
> page. The current code does not adjust the size in this case,
> truncating the lock region:
>
> Input: [----size----]
> Round: [----size----]
>
> To fix the truncation, extend the lock region by the amount rounded off.
>
> Input: [----size----]
> Round: [-------size------]
>
> This bug is difficult to hit under current assumptions: since the size
> of the lock region is stored as a ceil(log2), the truncation must cause
> us to cross a power-of-two boundary. This is possible, for example if
> the caller tries to lock 65535 bytes starting at iova 0xCAFE0010. The
> existing code rounds down the iova to 0xCAFE0000 and rounds up the lock
> region to 65536 bytes, locking until 0xCAFF0000. This fails to lock the
> last 15 bytes.
>
> In practice, the current callers pass PAGE_SIZE aligned inputs, avoiding
> the bug. Therefore this doesn't need to be backported. Still, that's a
> happy accident and not a precondition of lock_region, so we let's do the
> right thing to future proof.

Actually it's worse than that due to the hardware behaviour, the spec
states (for LOCKADDR_BASE):

> Only the upper bits of the address are used. The address is aligned to a
> multiple of the region size, so a variable number of low-order bits are
> ignored, depending on the selected region size. It is recommended that software
> ensures that these low bits in the address are cleared, to avoid confusion.

It appears that indeed this has caused confusion ;)

So for a simple request like locking from 0xCAFE0000 - 0xCB010000 (size
= 0x30000) the region width gets rounded up (to 0x40000) which causes
the start address to be effectively rounded down (by the hardware) to
0xCAFC0000 and we fail to lock 0xCB000000-0xCB010000.

Interestingly (unless my reading of this is wrong) that means to lock
0xFFFF0000-0x100010000 (i.e. crossing the 4GB boundary) requires locking
*at least* 0x00000000-0x200000000 (i.e. locking the first 8GB).

This appears to be broken in kbase (which actually does zero out the low
bits of the address) - I've raised a bug internally so hopefully someone
will tell me if I've read the spec completely wrong here.

Steve

> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@xxxxxxxxxxxxx>
> Reported-by: Steven Price <steven.price@xxxxxxx>
> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index dfe5f1d29763..14be32497ec3 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -63,6 +63,9 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
> u8 region_width;
> u64 region = iova & PAGE_MASK;
>
> + /* After rounding the address down, extend the size to lock the end. */
> + size += (region - iova);
> +
> /* The size is encoded as ceil(log2) minus(1), which may be calculated
> * with fls. The size must be clamped to hardware bounds.
> */
>