Re: [PATCH v2 0/5] iommu/vt-d: Ensure memory ordering in context & root entry updates

From: Baolu Lu
Date: Mon Jan 05 2026 - 22:37:24 EST


On 1/6/26 02:12, Jason Gunthorpe wrote:
On Sat, Dec 27, 2025 at 06:57:23PM +0100, Dmytro Maluka wrote:
As discussed in [1], we don't currently prevent the compiler from
reordering memory writes when updating context entries, which is
potentially dangerous, as it may cause setting the present bit (i.e.
enabling DMA translation for the given device) before finishing setting
up other bits in the context entry (and thus creating a time window when
a DMA from the device may result in unpredictable behavior).

Fix this in the same way as how this is already addressed for PASID
entries, i.e. by using READ_ONCE/WRITE_ONCE in the helpers used for
setting individual bits in context entries, so that memory writes done
by those helpers are ordered in relation to each other (plus, prevent
load/store tearing and so on).
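
A minimal user-space sketch of the helper pattern described above, not
the driver's actual code: the struct, bit layout, and demo_* names are
hypothetical, and READ_ONCE/WRITE_ONCE are simplified volatile-cast
stand-ins for the kernel macros. It only illustrates that accesses made
through volatile lvalues are not reordered against each other by the
compiler; it does not by itself model CPU or device-visible ordering.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical two-word entry; the bit layout is illustrative only. */
struct demo_context_entry {
	uint64_t lo;
	uint64_t hi;
};

#define DEMO_PRESENT   (1ULL << 0)
#define DEMO_ADDR_MASK (~0xfffULL)
#define DEMO_DID_SHIFT 8
#define DEMO_DID_MASK  (0xffffULL << DEMO_DID_SHIFT)

/* Set the (page-aligned) table address without touching other bits. */
static inline void demo_set_address(struct demo_context_entry *e, uint64_t addr)
{
	uint64_t v = READ_ONCE(e->lo);

	assert((addr & ~DEMO_ADDR_MASK) == 0);
	WRITE_ONCE(e->lo, (v & ~DEMO_ADDR_MASK) | addr);
}

/* Set the domain id field in the high word. */
static inline void demo_set_domain_id(struct demo_context_entry *e, uint16_t did)
{
	uint64_t v = READ_ONCE(e->hi);

	WRITE_ONCE(e->hi, (v & ~DEMO_DID_MASK) | ((uint64_t)did << DEMO_DID_SHIFT));
}

/* Set the present bit last, after the other fields are populated. */
static inline void demo_set_present(struct demo_context_entry *e)
{
	WRITE_ONCE(e->lo, READ_ONCE(e->lo) | DEMO_PRESENT);
}

static inline int demo_is_present(struct demo_context_entry *e)
{
	return (READ_ONCE(e->lo) & DEMO_PRESENT) != 0;
}
```

A caller would populate the address and domain id first and call
demo_set_present() last; because every store goes through a volatile
access, the compiler keeps them in program order.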

While at it, out of similar paranoia, fix updating root entries as well:
use WRITE_ONCE to make sure that the present bit is set atomically
together with the context table address bits, not before them.
The PASID entries should not be manipulated 'live' in a haphazard way
like this in the first place!

Like AMD and ARM, build the new PASID entry on the stack, and then copy
it to the DMA'able memory in a way that is consistent with the HW's
atomicity granule, paying attention not to 'tear' it.

This manipulate-in-place is just asking for trouble, and can never
support replace or full viommu requirements.. :\

So while it is perhaps an improvement to do this work, it would be
better to fix the root cause issue if someone has time..

Agreed. The current Intel IOMMU driver uses a 'clear-populate-set'
pattern protected by a spinlock, which is why it doesn't support
'replace' yet. Dmytro's patch addresses the immediate risk of the
compiler reordering those writes and exposing invalid data to the
hardware.

Moving to an on-stack construction (like AMD/ARM) and updating
atomically is the right direction for the driver. We'll look into that
refactoring as a follow-up series to modernize the entry manipulation
logic.
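
As a rough illustration of that direction (a user-space sketch under
stated assumptions, not driver code): the entry is assembled in a local
variable and then copied into the table slot, with the word carrying the
present bit stored last. The demo_* names, the two-word layout, and the
assumption that a 64-bit store matches the hardware's atomicity granule
are all hypothetical. C11 atomics stand in for the kernel primitives,
and the sketch covers only the install case (not-present to present),
not 'replace'; any device-facing barriers or invalidations a real driver
needs are not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Plain image of an entry, built privately on the stack. */
struct demo_entry {
	uint64_t lo;
	uint64_t hi;
};

/* Slot in the (hypothetical) table that the hardware would read. */
struct demo_live_entry {
	_Atomic uint64_t lo;
	_Atomic uint64_t hi;
};

#define DEMO_PRESENT (1ULL << 0)

/* Build the complete entry, present bit included, without touching the table. */
static inline struct demo_entry demo_build(uint64_t addr, uint16_t did)
{
	struct demo_entry e = { 0, 0 };

	e.lo = (addr & ~0xfffULL) | DEMO_PRESENT;
	e.hi = (uint64_t)did << 8;
	return e;
}

/*
 * Copy the prepared entry into a slot that is currently not present.
 * The high word goes first; the low word, which carries the present
 * bit, is stored last with release ordering so it cannot become
 * visible to other CPU threads before the high word.
 */
static inline void demo_publish(struct demo_live_entry *slot,
				const struct demo_entry *e)
{
	assert(!(atomic_load_explicit(&slot->lo, memory_order_relaxed) &
		 DEMO_PRESENT));
	atomic_store_explicit(&slot->hi, e->hi, memory_order_relaxed);
	atomic_store_explicit(&slot->lo, e->lo, memory_order_release);
}
```

The point of the split is that no partially populated entry ever exists
in the table: the slot goes from not-present straight to the finished
image.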

Thanks,
baolu