Re: [PATCH v1 1/1] spi: Remove unneded check for orig_nents

From: Neil Armstrong
Date: Wed May 22 2024 - 09:18:52 EST


Hi,

On 22/05/2024 13:53, Neil Armstrong wrote:
On 22/05/2024 13:33, Andy Shevchenko wrote:
On Wed, May 22, 2024 at 12:03:33PM +0200, Neil Armstrong wrote:
On 15/05/2024 23:09, Nícolas F. R. A. Prado wrote:
On Tue, May 07, 2024 at 11:10:27PM +0300, Andy Shevchenko wrote:
Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs()
have checks for orig_nents against 0. No need to duplicate this.
All the same applies to other DMA mapping API calls.

Also note, there is no other user in the kernel that does this kind of
checks.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>

Hi,

this commit caused a regression which I reported here:

https://lore.kernel.org/all/d3679496-2e4e-4a7c-97ed-f193bd53af1d@notapiano

along with some thoughts on the cause and a possible solution, though I'm not
familiar with this code base at all and would really appreciate any feedback you
may have.

I also see the same regression on the SM8550 and SM8650 platforms,
please CC linux-arm-msm@xxxxxxxxxxxxxxx and me for a potential fix to test on those platforms.

There is still no answer from IOMMU patch author. Do you have the same trace
due to IOMMU calls? Anyway, I guess it would be nice to see it.

Yes :
[    6.404623] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
<snip>
[    6.641597] lr : __dma_sync_sg_for_device+0x3c/0x40
<snip>
[    6.688286] Call trace:
[    6.688287]  iommu_dma_sync_sg_for_device+0x28/0x100
[    6.717582]  __dma_sync_sg_for_device+0x3c/0x40
[    6.717585]  spi_transfer_one_message+0x358/0x680
[    6.732229]  __spi_pump_transfer_message+0x188/0x494
[    6.732232]  __spi_sync+0x2a8/0x3c4
[    6.732234]  spi_sync+0x30/0x54
[    6.732236]  goodix_berlin_spi_write+0xf8/0x164 [goodix_berlin_spi]
[    6.739854]  _regmap_raw_write_impl+0x538/0x674
[    6.750053]  _regmap_raw_write+0xb4/0x144
[    6.750056]  regmap_raw_write+0x7c/0xc0
[    6.750058]  goodix_berlin_power_on+0xb0/0x1b0 [goodix_berlin_core]
[    6.765520]  goodix_berlin_probe+0xc0/0x660 [goodix_berlin_core]
[    6.765522]  goodix_berlin_spi_probe+0x12c/0x14c [goodix_berlin_spi]
[    6.772339]  spi_probe+0x84/0xe4
[    6.772342]  really_probe+0xbc/0x29c
[    6.784313]  __driver_probe_device+0x78/0x12c
[    6.784316]  driver_probe_device+0x3c/0x15c
[    6.784319]  __driver_attach+0x90/0x19c
[    6.784322]  bus_for_each_dev+0x7c/0xdc
[    6.794520]  driver_attach+0x24/0x30
[    6.794523]  bus_add_driver+0xe4/0x208
[    6.794526]  driver_register+0x5c/0x124
[    6.802586]  __spi_register_driver+0xa4/0xe4
[    6.802589]  goodix_berlin_spi_driver_init+0x20/0x1000 [goodix_berlin_spi]
[    6.802591]  do_one_initcall+0x80/0x1c8
[    6.902310]  do_init_module+0x60/0x218
[    6.921988]  load_module+0x1bcc/0x1d8c
[    6.925847]  init_module_from_file+0x88/0xcc
[    6.930238]  __arm64_sys_finit_module+0x1dc/0x2e4
[    6.935074]  invoke_syscall+0x48/0x114
[    6.938944]  el0_svc_common.constprop.0+0xc0/0xe0
[    6.943781]  do_el0_svc+0x1c/0x28
[    6.947195]  el0_svc+0x34/0xd8
[    6.950348]  el0t_64_sync_handler+0x120/0x12c
[    6.954833]  el0t_64_sync+0x190/0x194
[    6.958600] Code: 2a0203f5 2a0303f6 a90363f7 aa0003f7 (b9401c2

Reverting  8cc3bad9d9d6 ("spi: Remove unneded check for orig_nents") removes the crash.


Meanwhile, I have three changes I posted in the replies to the initial report,
can you combine them all and test? This will be a plan B (? or A, depending on
the culprit).


I'll try to apply them and test.

I stacked the 3 changes, and it works:
Tested-by: Neil Armstrong <neil.armstrong@xxxxxxxxxx> # on SM8650-QRD

For reference, the changeset looks like:
============><============================================================================================
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 289feccca376..0851c5e1fd1f 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1220,6 +1220,11 @@ void spi_unmap_buf(struct spi_controller *ctlr, struct device *dev,
spi_unmap_buf_attrs(ctlr, dev, sgt, dir, 0);
}

+/* Dummy SG for unidirect transfers */
+static struct scatterlist dummy_sg = {
+ .page_link = SG_END,
+};
+
static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
{
struct device *tx_dev, *rx_dev;
@@ -1243,6 +1248,7 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
else
rx_dev = ctlr->dev.parent;

+ ret = -ENOMSG;
list_for_each_entry(xfer, &msg->transfers, transfer_list) {
/* The sync is done before each transfer. */
unsigned long attrs = DMA_ATTR_SKIP_CPU_SYNC;
@@ -1257,6 +1263,9 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)
attrs);
if (ret != 0)
return ret;
+ } else {
+ memset(&xfer->tx_sg, 0, sizeof(xfer->tx_sg));
+ xfer->tx_sg.sgl = &dummy_sg;
}

if (xfer->rx_buf != NULL) {
@@ -1270,8 +1279,14 @@ static int __spi_map_msg(struct spi_controller *ctlr, struct spi_message *msg)

return ret;
}
+ } else {
+ memset(&xfer->rx_sg, 0, sizeof(xfer->rx_sg));
+ xfer->rx_sg.sgl = &dummy_sg;
}
}
+ /* No transfer has been mapped, bail out with success */
+ if (ret)
+ return 0;

ctlr->cur_rx_dma_dev = rx_dev;
ctlr->cur_tx_dma_dev = tx_dev;
============><============================================================================================

Thanks,
Neil

Neil