Re: Implementing .shutdown method for efa module

From: Margolin, Michael
Date: Mon Apr 01 2024 - 09:24:14 EST


Jason

Thanks for your response. efa_remove() resets the device, which should stop all DMA from the device.

Apart from skipping cleanups that are unnecessary in the shutdown flow, are there any other reasons to prefer a separate function for shutdown?


Michael

On 3/26/2024 5:32 PM, Jason Gunthorpe wrote:

On Tue, Mar 26, 2024 at 02:34:45PM +0200, Margolin, Michael wrote:
Hi Tao,

Thanks for bringing this up.

I've unsuccessfully tried to reproduce this kernel panic using a production
Red Hat 9.3 AMI (5.14.0-362.18.1.el9_3.aarch64).

Are there any related changes in the kernel you are testing?

Anyways we do need to handle shutdown properly, please let us know if calling
efa_remove solves your issue.

efa_remove should not be used for shutdown.

If you have an iommu in your system (smmuv3 for this ARM64 case) then
drivers must implement a shutdown handler or you will risk data
corruption on ARM64 systems during crash.

The shutdown handler must stop all DMA from the device.

If you don't have an iommu then the shutdown handler shouldn't be
critical.

Jason