Re: RFC: Restricting userspace interfaces for CXL fabric management
From: Dan Williams
Date: Tue Apr 23 2024 - 20:08:25 EST
Jonathan Cameron wrote:
[..]
> > It is not clear to me that this material makes sense to house in
> > drivers/ vs tools/ or even out-of-tree just for maintenance burden
> > relief of keeping the universes separated. What does the Linux kernel
> > project get out of carrying this in mainline alongside the inband code?
>
> I'm not sure what you mean by in band. Aim here was to discuss
> in-band drivers for switch CCI etc. Same reason from a kernel point of
> view for why we include embedded drivers. I'll interpret in band
> as host driven and not inband as FM-API stuff.
>
> > I do think the mailbox refactoring to support non-CXL use cases is
> > interesting, but only so far as refactoring is consumed for inband use
> > cases like RAS API.
>
> If I read this right, I disagree with the 'only so far' bit.
>
> In all substantial ways we should support BMC use case of the Linux Kernel
> at a similar level to how we support forms of Linux Distros.
I think we need to talk in terms of specifics, because in the general
case I do not see the blockage. OpenBMC currently is based on v6.6.28
and carries 136 patches. An additional patch to turn off raw command
restrictions over there would not even be noticed.
> It may not be our target market as developers for particular parts of
> our companies, but we should not block those who want to support it.
It is also the case that there is a responsibility to build maintainable
kernel interfaces that can be reasoned about, especially with devices as
powerful as CXL that are trusted to host system memory and be caching
agents. For example, I do not want to be in the position of auditing
whether proposed tunnels and passthroughs violate lockdown expectations.
Also, given the assertion that these kernels will be built with
CONFIG_SECURITY_LOCKDOWN_LSM=n and likely CONFIG_STRICT_DEVMEM=n, the
entire user-mode driver ABI is available for use. CXL commands are
simple polled MMIO. Does Linux really benefit from carrying drivers in
the kernel that the kernel itself does not care about?
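
To make the "simple polled MMIO" point concrete, here is a minimal
user-space sketch of a doorbell-and-poll mailbox exchange. It is an
illustration under assumptions, not the in-kernel implementation: the
register offsets and field positions follow my reading of the CXL
primary mailbox layout and should be checked against the spec, and
struct mbox_io, mbox_send() and the fake_* helpers are hypothetical
names. Register access goes through an accessor table so the same logic
can run against an mmap()ed PCI BAR or, as here, a fake device in
memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: offsets/fields per my reading of the CXL mailbox layout. */
#define MBOX_CTRL     0x04  /* bit 0: doorbell */
#define MBOX_CMD      0x08  /* bits 15:0 opcode, bits 36:16 payload length */
#define MBOX_STATUS   0x10  /* bits 47:32 return code */
#define MBOX_PAYLOAD  0x20
#define MBOX_DOORBELL 0x1u

/* Hypothetical accessor table: real MMIO or a fake can sit behind it. */
struct mbox_io {
	uint32_t (*rd32)(void *ctx, uint32_t off);
	void (*wr32)(void *ctx, uint32_t off, uint32_t val);
	uint64_t (*rd64)(void *ctx, uint32_t off);
	void (*wr64)(void *ctx, uint32_t off, uint64_t val);
	void *ctx;
};

/* Returns the device return code (0 = success), -1 on timeout, -2 if busy. */
static int mbox_send(const struct mbox_io *io, uint16_t opcode,
		     const void *in, size_t in_len, unsigned max_polls)
{
	const uint8_t *p = in;
	size_t i;

	assert(in_len < (1u << 21));	/* length field is 21 bits wide */

	if (io->rd32(io->ctx, MBOX_CTRL) & MBOX_DOORBELL)
		return -2;		/* a previous command still owns the mailbox */

	for (i = 0; i < in_len; i += 4) {	/* copy payload, zero-pad the tail */
		uint32_t word = 0;
		size_t n = in_len - i < 4 ? in_len - i : 4;

		memcpy(&word, p + i, n);
		io->wr32(io->ctx, MBOX_PAYLOAD + (uint32_t)i, word);
	}

	io->wr64(io->ctx, MBOX_CMD, (uint64_t)opcode | ((uint64_t)in_len << 16));
	io->wr32(io->ctx, MBOX_CTRL, MBOX_DOORBELL);	/* ring the doorbell */

	while (max_polls--) {		/* poll until the device clears it */
		if (!(io->rd32(io->ctx, MBOX_CTRL) & MBOX_DOORBELL))
			return (int)((io->rd64(io->ctx, MBOX_STATUS) >> 32) & 0xffff);
	}
	return -1;
}

/* Fake device: clears the doorbell after 'busy_polls' reads of MBOX_CTRL. */
struct fake_dev {
	uint8_t regs[0x100];
	unsigned busy_polls;
	uint16_t ret_code;
};

static uint32_t fake_rd32(void *ctx, uint32_t off)
{
	struct fake_dev *d = ctx;
	uint32_t v;

	memcpy(&v, d->regs + off, 4);
	if (off == MBOX_CTRL && (v & MBOX_DOORBELL)) {
		if (d->busy_polls == 0) {	/* "complete" the command */
			uint64_t st = (uint64_t)d->ret_code << 32;

			v &= ~MBOX_DOORBELL;
			memcpy(d->regs + MBOX_CTRL, &v, 4);
			memcpy(d->regs + MBOX_STATUS, &st, 8);
		} else {
			d->busy_polls--;
		}
	}
	return v;
}

static void fake_wr32(void *ctx, uint32_t off, uint32_t val)
{
	memcpy(((struct fake_dev *)ctx)->regs + off, &val, 4);
}

static uint64_t fake_rd64(void *ctx, uint32_t off)
{
	uint64_t v;

	memcpy(&v, ((struct fake_dev *)ctx)->regs + off, 8);
	return v;
}

static void fake_wr64(void *ctx, uint32_t off, uint64_t val)
{
	memcpy(((struct fake_dev *)ctx)->regs + off, &val, 8);
}

static struct mbox_io fake_io(struct fake_dev *d)
{
	struct mbox_io io = { fake_rd32, fake_wr32, fake_rd64, fake_wr64, d };
	return io;
}
```

A caller would build a struct mbox_io with fake_io() (or with accessors
over a mapped BAR) and call, for example,
mbox_send(&io, 0x4000, NULL, 0, 1000). The sketch ignores the payload
size advertised by the device, background commands and interrupts,
which a real driver has to handle.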
[..]
> Switch CCI Driver: PCI driver doing everything beyond the CXL mbox specific bit.
> Type 3 Stack: All the normal stack just with the CXL Mailbox specific stuff factored
> out. Note we can move different amounts of shared logic in here, but
> in essence it deals with the extra layer on top of the raw MMPT mbox.
> MMPT Mbox: Mailbox as per the PCI spec.
> RAS API: Shared RAS API specific infrastructure used by other drivers.
Once the CXL mailbox core is turned into a library for kernel internal
consumers, like RAS API, or CXL accelerators, then it becomes easier to
add a Switch CCI consumer (perhaps as an out-of-tree module in tools/),
but it is still not clear why the kernel benefits from that arrangement.
This is less about blocking developers that have different goals and
more about finding the right projects / places to solve the problem,
especially when disjoint design goals are in play and user space drivers
might be in reach.
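
As a rough illustration of the "mailbox core as a library" arrangement,
the sketch below separates a shared transport from per-consumer command
policy. Every name in it (struct mbox_transport, struct mbox_consumer,
mbox_submit(), the two policy callbacks) is hypothetical and not taken
from the kernel, and the opcode values are only examples. The point is
that a host-side consumer can keep a strict allow-list while a
fabric-management consumer layered on the same transport carries its
own, looser policy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport shared by all consumers of the mailbox core. */
struct mbox_transport {
	int (*send)(void *ctx, uint16_t opcode, const void *in, size_t in_len);
	void *ctx;
};

/* Hypothetical consumer: the same transport, but its own command policy. */
struct mbox_consumer {
	const char *name;
	const struct mbox_transport *xport;
	bool (*allowed)(uint16_t opcode);
};

/* Core entry point: apply the consumer's policy, then use the transport. */
static int mbox_submit(const struct mbox_consumer *c, uint16_t opcode,
		       const void *in, size_t in_len)
{
	assert(c && c->xport && c->allowed);

	if (!c->allowed(opcode))
		return -EPERM;	/* rejected before it reaches the device */
	return c->xport->send(c->xport->ctx, opcode, in, in_len);
}

/* Host-style policy: a short, auditable allow-list (example opcode only). */
static bool type3_allowed(uint16_t opcode)
{
	static const uint16_t ok[] = { 0x4000 };
	size_t i;

	for (i = 0; i < sizeof(ok) / sizeof(ok[0]); i++)
		if (ok[i] == opcode)
			return true;
	return false;
}

/* Fabric-management-style policy: raw passthrough, nothing filtered. */
static bool raw_allowed(uint16_t opcode)
{
	(void)opcode;
	return true;
}

/* Stand-in transport that only counts how many commands were sent. */
static int count_send(void *ctx, uint16_t opcode, const void *in, size_t in_len)
{
	(void)opcode; (void)in; (void)in_len;
	++*(int *)ctx;
	return 0;
}
```

With a transport built on count_send(), a consumer using type3_allowed
gets -EPERM for any opcode outside its list, while a consumer using
raw_allowed reaches the transport with the same opcode. Where that
second consumer should live (mainline, tools/, or out of tree) is the
open question in this thread; the sketch only shows that the split is
mechanically simple.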
[..]
> > > The various CXL upstream developers and maintainers may have
> > > differing views of course, but my current understanding is we want
> > > to support 1 and 2, but are very resistant to 3!
> >
> > 1, yes, 2, need to see the patches, and agree on 3.
>
> If we end up with top architecture of the diagrams above, 2 will look pretty
> similar to last version of the switch-cci patches. So raw commands only + taint.
> Factoring out MMPT is another layer that doesn't make that much difference in
> practice to this discussion. Good to have, but the reuse here would be one layer
> above that.
>
> Or we just say go for second proposed architecture and 0 impact on the
> CXL specific code, just reuse of the MMPT layer. I'd imagine people will get
> grumpy on code duplication (and we'll spend years rejecting patch sets that
> try to share the code) but there should be no maintenance burden as
> a result.
I am assuming that the shared code between MMPT and CXL will happen and
that all of the command infrastructure is where centralized policy
cannot keep up. If OpenBMC wants to land a driver that consumes the MMPT
core in tools/ that would seem to satisfy both the concerns of mainline
not shipping ABI that host kernels need to strictly reason about while
letting OpenBMC not need to carry out-of-tree patches indefinitely.