Hi Jon,
On Wed, 29 Jan 2025 14:51:35 +0000, Jon Hunter <jonathanh@xxxxxxxxxx> wrote:
Hi Furong,
On 27/01/2025 13:28, Thierry Reding wrote:
On Sat, Jan 25, 2025 at 11:03:47PM +0800, Furong Xu wrote:
Hi Thierry
On Sat, 25 Jan 2025 12:20:38 +0200, Ido Schimmel wrote:
On Fri, Jan 24, 2025 at 10:42:56AM +0800, Furong Xu wrote:
On Thu, 23 Jan 2025 22:48:42 +0100, Andrew Lunn <andrew@xxxxxxx>
wrote:
Just to clarify, the patch that you had us try was not intended
as an actual fix, correct? It was only for diagnostic purposes,
i.e. to see if there is some kind of cache coherence issue,
which seems to be the case? So perhaps the only fix needed is
to add dma-coherent to our device tree?
That sounds quite error prone. How many other DT blobs are
missing the property? If the memory should be coherent, i would
expect the driver to allocate coherent memory. Or the driver
needs to handle non-coherent memory and add the necessary
flush/invalidates etc.
stmmac driver does the necessary cache flush/invalidates to
maintain cache lines explicitly.
Given the problem happens when the kernel performs syncing, is it
possible that there is a problem with how the syncing is performed?
I am not familiar with this driver, but it seems to allocate multiple
buffers per packet when split header is enabled and these buffers are
allocated from the same page pool (see stmmac_init_rx_buffers()).
Despite that, the driver is creating the page pool with a non-zero
offset (see __alloc_dma_rx_desc_resources()) to avoid syncing the
headroom, which is only present in the head buffer.
I asked Thierry to test the following patch [1] and initial testing
seems OK. He also confirmed that "SPH feature enabled" shows up in the
kernel log.
It is recommended to disable the "SPH feature" by default unless some
certain cases depend on it. Like Ido said, two large buffers being
allocated from the same page pool for each packet, this is a huge waste
of memory, and brings performance drops for most of general cases.
Our downstream driver and two mainline drivers disable SPH by default:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/stmicro/stmmac/dwmac-dwc-qos-eth.c#n357
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c#n471
Okay, that's something we can look into changing. What would be an
example of a use-case depending on SPH? Also, isn't this something
that should be a policy that users can configure?
Irrespective of that we should fix the problems we are seeing with
SPH enabled.
Any update on this?
Sorry for my late response, I was on Chinese New Year holiday.
The fix is sent, and it will be so nice to have your Tested-by: tag there:
https://lore.kernel.org/all/20250207085639.13580-1-0x1207@xxxxxxxxx/