Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec?

From: Robin Murphy
Date: Tue Apr 02 2024 - 12:33:48 EST


On 2024-03-22 3:51 pm, Will Deacon wrote:
On Tue, Mar 19, 2024 at 06:17:39PM +0000, Robin Murphy wrote:
In terms of the shutdown behaviour, I think it actually works out as-is. For
the normal case we haven't touched GBPA, so we are truly returning to the
boot-time condition; in the unexpected case where SMMUEN was already enabled
then we'll go into an explicit GBPA abort state, but that seems a
not-unreasonable compromise for not preserving the entire boot-time Stream
Table etc., whose presence kind of implies it wouldn't have been bypassing
everything anyway.
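
As a rough illustration of the shutdown reasoning above, here is a toy userspace model, not driver code. All of the struct, enum and function names are made up for the sketch. It assumes the behaviour described in this thread: reset only forces GBPA to abort if it finds SMMUEN already set, and shutdown clears SMMUEN without touching GBPA.

```c
#include <stdbool.h>

/* Hypothetical model of what GBPA does to traffic while SMMUEN is clear. */
enum gbpa_mode { GBPA_MODE_BYPASS, GBPA_MODE_ABORT };

struct smmu_model {
	bool smmuen;		/* models CR0.SMMUEN */
	enum gbpa_mode gbpa;	/* models the GBPA behaviour */
};

/* Probe-time reset: only touch GBPA if we unexpectedly find SMMUEN set. */
static void model_reset(struct smmu_model *s)
{
	if (s->smmuen)
		s->gbpa = GBPA_MODE_ABORT;
	s->smmuen = false;
}

/* Normal operation: translation is switched on. */
static void model_enable(struct smmu_model *s)
{
	s->smmuen = true;
}

/* Shutdown: disable translation and leave GBPA as it was. */
static void model_shutdown(struct smmu_model *s)
{
	s->smmuen = false;
}

/* Run one reset/enable/shutdown cycle and report the resulting GBPA mode. */
static enum gbpa_mode model_after_shutdown(bool boot_smmuen,
					   enum gbpa_mode boot_gbpa)
{
	struct smmu_model s = { .smmuen = boot_smmuen, .gbpa = boot_gbpa };

	model_reset(&s);
	model_enable(&s);
	model_shutdown(&s);
	return s.gbpa;
}
```

In this model, a boot state with SMMUEN clear comes back out of shutdown with whatever GBPA mode it started with, while a boot state with SMMUEN set always ends in abort.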

The more I look at the remaining aspect of disable_bypass for controlling
broken-DT behaviour the more I suspect it can't actually be useful either
way, especially not since default domains. I have no memory of what my
original reasoning might have been, so I'm inclined to just rip that all out
and let probe fail. I see no reason these days not to expect a broken DT to
lead to a broken system, especially not now with DTSchema validation.

That sounds reasonable to me, although we may end up having to back it
out if we regress systems with borked firmware :(

Then there's just the kdump warning it suppresses, and I have no idea why
that's there either, but apparently that one's on you :P

I think _that_ one is because the previous (crashed) kernel won't have
torn anything down, so there could be active DMA using translations in
the SMMU. In that case, the crashkernel (which is running from some
carveout) may find the SMMU enabled, but it really can't stick it into
bypass mode because that's likely to corrupt random memory. So in that
case, we do stick it into abort before we reinitialise it and then we
disable fault reporting altogether to avoid the log spam:

	if (is_kdump_kernel())
		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
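
A standalone sketch of that masking, for anyone following along without the driver source open. The CR0_* bit positions are assumed to match the SMMUv3 CR0 layout and should be checked against the spec. smmu_cr0_enables() and its is_kdump parameter are hypothetical stand-ins for the driver's logic and for is_kdump_kernel(). The real driver builds up its enables in steps and only turns on the PRI queue when the hardware has one, so this is a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CR0 bit positions; verify against the SMMUv3 spec. */
#define CR0_SMMUEN	(1u << 0)
#define CR0_PRIQEN	(1u << 1)
#define CR0_EVTQEN	(1u << 2)
#define CR0_CMDQEN	(1u << 3)

/*
 * Hypothetical helper: compute the CR0 enable bits to write.
 * is_kdump stands in for is_kdump_kernel().
 */
static uint32_t smmu_cr0_enables(bool is_kdump)
{
	uint32_t enables = CR0_SMMUEN | CR0_CMDQEN | CR0_EVTQEN | CR0_PRIQEN;

	/* In a crash kernel, leave the event and PRI queues off to avoid log spam. */
	if (is_kdump)
		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);

	return enables;
}
```

With is_kdump set, only SMMUEN and CMDQEN remain in the result, so faults from stale DMA are not reported through the event queue.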

Oh, I know why we do what we do for the kdump situation in general - it was merely the matter of why we chose to demand that the user explicitly tells us to do what we know is the right thing (and scream at them if they don't), rather than to just go ahead and do the right thing anyway.

(The significance of disable_bypass for kdump comes after we turn the SMMU back on from the GBPA abort state: we don't want any ongoing traffic to be able to bypass inadvertently via an STE config either.)
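
To make that last point concrete, a toy model, again not driver code. It assumes that streams the new kernel has not yet attached to a domain get an initial STE that aborts when disable_bypass is set and bypasses otherwise. The enum and function names are hypothetical.

```c
#include <stdbool.h>

enum dma_outcome { DMA_ABORTED, DMA_BYPASSED };

/*
 * Hypothetical model: what happens to leftover DMA from a stream that
 * the new kernel has not attached to any domain.
 */
static enum dma_outcome stale_dma_outcome(bool smmuen, bool gbpa_abort,
					  bool disable_bypass)
{
	/* With SMMUEN clear, GBPA alone decides. */
	if (!smmuen)
		return gbpa_abort ? DMA_ABORTED : DMA_BYPASSED;

	/* With SMMUEN set, the initial STE config decides instead. */
	return disable_bypass ? DMA_ABORTED : DMA_BYPASSED;
}
```

Under those assumptions, stale_dma_outcome(true, true, false) comes out as DMA_BYPASSED: the GBPA abort stops protecting us as soon as SMMUEN goes back on, unless the STEs abort as well.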

Cheers,
Robin.