Re: [PATCH v4 0/2] Support zero-sized HDM decoders

From: Gregory Price

Date: Tue Jun 23 2026 - 13:02:20 EST

On Tue, Jun 09, 2026 at 04:13:58PM -0700, Dan Williams (nvidia) wrote:
> Richard Cheng wrote:
> > Hello,
> >
> > This v4 continues Vishal Aslot's "Support zero-sized decoders" series [1]
> > and addresses the v3 review of patch 1's port->hdm_end handling [2].
> >
> > CXL r3.2 §8.2.4.20.12 and §14.13.10 permit committing an HDM decoder with
> > size 0. BIOS commits and LOCKs such decoders to burn the trailing, unused
> > slots so the OS cannot program regions through them, e.g. a Type 3 device
> > in a Trusted Computing Base (TCB) established via the Trusted Security
> > Protocol (TSP). init_hdm_decoder() rejected these with -ENXIO during port
> > enumeration and aborted the whole port, so affected systems showed nothing
> > under 'cxl list'.
> >
> > Patch 1 enumerates the decoder into the topology with its HW-reported LOCK
> > state and skips the DPA reservation it does not need.
> >
> > On port->hdm_end (the v3 review): v3 advanced the watermark for the
> > zero-size decoder. sashiko correctly noted the write was outside
> > cxl_rwsem.dpa, and that advancing it without a balanced release strands
> > hdm_end -- cxl_dpa_free() returns early on !dpa_res, so it can never be
> > decremented past the zero-size id, breaking LIFO teardown of lower
> > decoders. v4 therefore does not touch hdm_end at all. The in-order check
> > in __cxl_dpa_reserve() is its only consumer and is never legitimately
> > reached past such a decoder: the burned slots are trailing, so enumeration
> > reserves no committed decoder after one, and the OS must not program a
> > region through a locked slot. hdm_end stays at the last sized reservation,
> > which is accurate. IMHO, if a non-trailing zero-size layout ever needs
> > support, the check should key off commit_end rather than hdm_end,
> > out of scope here.
>
> I am not comfortable with this outcome. It assumes that zero-sized
> decoders are always committed. I would much rather keep the meaning of
> hdm_end as the marker of the last decoder set aside for a reservation.
>

(pre: before a programmable decoder, post: after one)

Pre-locked zero-sized decoders *must* be committed if the non-zero
decoders are programmable.

Post-locked zero-sized committed decoders *are not legal* if the
non-zero decoders are programmable. They can only be legal IFF the
entire device came up with decoders programmed and locked.

Violating either condition implies an out of order commit has
occurred, and trying to deal with zero-sized decoders as a
special class is just a giant footgun.

Covered this here:
https://lore.kernel.org/linux-cxl/aPeSqjqU6BH9gvcw@gourry-fedora-PF4VCD3F/
https://lore.kernel.org/linux-cxl/aYynWqJ7u-v-6WsZ@gourry-fedora-PF4VCD3F/

So, I agree that zero-sized decoders can't be assumed to be committed.

Consider this case:

            Pre-lock                          Post-lock
decoder         0            1             2     ...     N
------------------------------------------------------------------
           [zero-lock]  [programmed]  [zero-lock]   [zero-lock]
           ^ must be committed        ^ must not be committed
                                        unless D1 is locked
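
As a rough sketch of the two ordering rules above (illustration only:
struct hdm_slot and hdm_layout_is_consistent() are hypothetical names,
not kernel code, and "programmable" is assumed to mean non-zero size
and not locked):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one HDM decoder slot; not a kernel structure. */
struct hdm_slot {
	bool zero_size;
	bool locked;
	bool committed;
};

/* Assumption: a slot is "programmable" if it is sized and not locked. */
static bool slot_is_programmable(const struct hdm_slot *s)
{
	return !s->zero_size && !s->locked;
}

/*
 * Returns false if a locked zero-sized slot implies an out-of-order commit:
 *  - it sits before a programmable slot but is not committed, or
 *  - it sits after a programmable slot and is committed.
 */
static bool hdm_layout_is_consistent(const struct hdm_slot *slots, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!slots[i].zero_size || !slots[i].locked)
			continue;

		for (size_t j = 0; j < n; j++) {
			if (!slot_is_programmable(&slots[j]))
				continue;
			if (j > i && !slots[i].committed)
				return false;	/* pre-lock slot must be committed */
			if (j < i && slots[i].committed)
				return false;	/* post-lock slot must not be committed */
		}
	}
	return true;
}
```

With D1 programmable, { committed zero-lock, D1, uncommitted zero-lock }
passes and { committed zero-lock, D1, committed zero-lock } does not. If
D1 is itself locked, nothing is programmable and every layout passes.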

Anyway, agreed: zero-sized decoders cannot be assumed to always be
committed, and the spec (or at least my last reading of it) leaves it
ambiguous what state a zero-sized decoder must be in.

~Gregory