Re: [PATCH 0/3] More ARM DMA ops cleanup

From: Robin Murphy
Date: Wed Aug 31 2022 - 13:10:05 EST


On 2022-08-31 17:41, Yongqin Liu wrote:
> Hi, Robin
>
> On Tue, 30 Aug 2022 at 23:37, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>>
>> On 2022-08-30 16:19, Yongqin Liu wrote:
>>> Hi, Robin
>>>
>>> Thanks for the kind reply!
>>>
>>> On Tue, 30 Aug 2022 at 17:48, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>>>>
>>>> On 2022-08-27 13:24, Yongqin Liu wrote:
>>>>> Hi, Robin, Christoph
>>>>>
>>>>> With the changes landed in the mainline kernel,
>>>>> one problem is exposed with our out-of-tree pvr module.
>>>>> Like the source here[1], arm_dma_ops.sync_single_for_cpu is called
>>>>> in a form like the following:
>>>>>     arm_dma_ops.sync_single_for_cpu(NULL, pStart, pEnd - pStart,
>>>>>                                     DMA_FROM_DEVICE);
>>>>>
>>>>> Not sure if you could give some suggestions on what I should do next
>>>>> to make the pvr module work again.
>>>>
>>>> Wow, that driver reinvents so many standard APIs for no apparent reason
>>>> it's not even funny.
>>>>
>>>> Anyway, from a brief look it seemingly already knows how to call the DMA
>>>> API semi-correctly, so WTF that's doing behind an #ifdef, who knows?
>>>> However it's still so completely wrong in general - fundamentally broken
>>>> AArch64 set/way cache maintenance!? - that it looks largely beyond help.
>>>> "Throw CONFIG_DMA_API_DEBUG at it and cry" is about the extent of
>>>> support I'm prepared to provide for that mess.
>>>
>>> For the moment, I do not care about the AArch64 lines. If we only
>>> consider the following two lines:
>>>     arm_dma_ops.sync_single_for_device(NULL, pStart, pEnd - pStart,
>>>                                        DMA_TO_DEVICE);
>>>     arm_dma_ops.sync_single_for_cpu(NULL, pStart, pEnd - pStart,
>>>                                     DMA_FROM_DEVICE);
>>>
>>> Could you please give some suggestions for that?
>>
>> Remove them. Then remove the #ifdef __aarch64__ too, since the code under
>> there is doing a passable impression of generic DMA API usage, as long
>> as one ignores the bigger picture.
>
> I tried this method, and found that if I only update the
> pvr_flush_range and pvr_clean_range functions, the build can still
> boot to the home screen.
>
> But if I update all of the pvr_flush_range, pvr_clean_range and
> pvr_invalidate_range functions with this method (removing the
> arm_dma_ops lines and the #ifdef __aarch64__ lines), then an
> "Unable to handle kernel NULL pointer dereference at virtual
> address 0000003c" error is reported, as shown here: http://ix.io/49gu
>
> Not sure if you have any idea from the log, or could you please give
> some suggestions on how to debug it.

Obviously there's almost certainly going to be more work to do on top to make the newly-exposed codepath actually behave as expected - I was simply making a general suggestion for a starting point based on looking at half a dozen lines of code in isolation.

To restate the point yet again in the hope that it's clear this time, the DMA ops on ARM are now effectively the same as the DMA ops on arm64, and will behave the same way. Assuming the driver already works on arm64, then the aim should be to unify all the ARM and arm64 codepaths for things that involve the DMA API. If you don't understand the code well enough to do that, please contact Imagination; I don't support their driver.
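
As a rough illustration of what a unified codepath might look like, here is a minimal sketch of the flush helper quoted earlier in the thread, written against the generic dma_sync_single_for_device() and dma_sync_single_for_cpu() calls with no architecture #ifdef. It is not the Imagination driver's actual code: pvr_get_device() and pvr_dev are hypothetical placeholders for however the driver obtains its real struct device, the addresses are assumed to be DMA addresses previously returned by a dma_map_*() call, and the block under #ifndef __KERNEL__ is only a user-space stand-in so the sketch compiles outside a kernel tree.

```c
#include <stddef.h>
#include <stdint.h>

#ifndef __KERNEL__
/*
 * User-space stand-ins (assumption, for illustration only). In a real
 * module, drop this whole block and #include <linux/dma-mapping.h>.
 */
struct device { int unused; };
typedef uint64_t dma_addr_t;
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
};

/* Record each sync call so the behaviour can be inspected. */
struct sync_call {
	int for_cpu;
	struct device *dev;
	dma_addr_t addr;
	size_t size;
	enum dma_data_direction dir;
};
static struct sync_call sync_log[8];
static int sync_log_len;

static void log_sync(int for_cpu, struct device *dev, dma_addr_t addr,
		     size_t size, enum dma_data_direction dir)
{
	if (sync_log_len < 8) {
		struct sync_call c = { for_cpu, dev, addr, size, dir };
		sync_log[sync_log_len++] = c;
	}
}

static void dma_sync_single_for_device(struct device *dev, dma_addr_t addr,
				       size_t size, enum dma_data_direction dir)
{
	log_sync(0, dev, addr, size, dir);
}

static void dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr,
				    size_t size, enum dma_data_direction dir)
{
	log_sync(1, dev, addr, size, dir);
}
#endif /* !__KERNEL__ */

/* Hypothetical: however the driver looks up its real struct device. */
static struct device *pvr_dev;

static struct device *pvr_get_device(void)
{
	return pvr_dev;
}

/* One codepath for ARM and arm64: generic DMA API, real device pointer. */
static void pvr_flush_range(dma_addr_t start, dma_addr_t end)
{
	struct device *dev = pvr_get_device();

	dma_sync_single_for_device(dev, start, end - start, DMA_TO_DEVICE);
	dma_sync_single_for_cpu(dev, start, end - start, DMA_FROM_DEVICE);
}
```

Note that the sketch passes a real struct device rather than NULL, since the generic calls consult the device to decide how to perform the maintenance. Whether the clean and invalidate helpers should follow the same shape depends on how the rest of the driver maps its buffers.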

Thanks,
Robin.