IMHO a WARN_ON is fine, and alerts to the possible machine corruptionI don't think you should restore, this should reflect a locking errorIn case of error, what if we pr_warn and put the device in blocking mode
but we still need to move forward and put some kind of correct
data.. The code can't go backwards so it should try to move forwards..
since we need to prevent malicious DMAs.
No need to do blocking, you should have a perfectly valid target DTE
that represents the state the HW is expected to be in. Resolve the
race by making it bin that state and move forwards.
Even if the IOMMU HW does a perfect 256 bit atomic read you still haveOn ordering, I don't know, is this OK?IOMMU hardware do not do partial interpret of the DTE and SW ensure DTE
If you are leaving/entering nesting mode I think you have to write the
[2] value in the right sequence, you don't want to have the viommu
enabled unless the host page table is setup properly. So [2] is
written last when enabling, and first when disabling. Flushes required
after each write to ensure the HW doesn't see a cross-128 word bit
tear.
GuestPagingMode also has to be sequenced correctly, the GCR3 tablepointer should be invalid when it is changed, meaning you have to
write it and flush before storing the GCR3 table, and the reverse to
undo it.
The ordering, including when DTE flushes are needed, is pretty
hard. This is much simpler than, say, ARM, so I think you could open
code it, but it should be a pretty sizable bit of logic to figure out
what to do.
flush after updating the DTE. Therefore, ordering should not be of a concern
here as long as the driver correctly program the entry.
to order the CPU writes correctly. It just means you don't need to
flush.
The guidelines in "2.2.2.2 Making Device Table Entry Changes" make
this clear. The indivudal CPU writes smaller than 256 bits have to be
sequenced right.
This section looks like it was written before translation bits were
placed in the other 128 bit word - it assumes a single 128 bit write
is always sufficient which isn't true anymore.
So you still have the issue of having to decide if you write 128 bit
[0] or [1] first.