Re: [PATCH v2 5/8] cxl/mem: Add a "RAW" send command

From: Dan Williams
Date: Thu Feb 11 2021 - 12:40:05 EST


On Wed, Feb 10, 2021 at 7:27 AM <Ariel.Sibley@xxxxxxxxxxxxx> wrote:
>
> > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > index c4ba3aa0a05d..08eaa8e52083 100644
> > --- a/drivers/cxl/Kconfig
> > +++ b/drivers/cxl/Kconfig
> > @@ -33,6 +33,24 @@ config CXL_MEM
> >
> > If unsure say 'm'.
> >
> > +config CXL_MEM_RAW_COMMANDS
> > + bool "RAW Command Interface for Memory Devices"
> > + depends on CXL_MEM
> > + help
> > + Enable CXL RAW command interface.
> > +
> > + The CXL driver ioctl interface may assign a kernel ioctl command
> > + number for each specification defined opcode. At any given point in
> > + time the number of opcodes that the specification defines and a device
> > + may implement may exceed the kernel's set of associated ioctl function
> > + numbers. The mismatch is either by omission, specification is too new,
> > + or by design. When prototyping new hardware, or developing / debugging
> > + the driver it is useful to be able to submit any possible command to
> > + the hardware, even commands that may crash the kernel due to their
> > + potential impact to memory currently in use by the kernel.
> > +
> > + If developing CXL hardware or the driver say Y, otherwise say N.
>
> Blocking RAW commands by default will prevent vendors from developing user space tools that utilize vendor specific commands. Vendors of CXL.mem devices should take ownership of ensuring any vendor defined commands that could cause user data to be exposed or corrupted are disabled at the device level for shipping configurations.

What follows is my personal opinion as a Linux kernel developer, not
necessarily the opinion of my employer...

Aside from the convention that new functionality is always default N,
it is the Linux distributor that decides the configuration. In an
environment where the kernel is developing features like
CONFIG_SECURITY_LOCKDOWN_LSM that limit the ability of the kernel to
subvert platform features like secure boot, it is incumbent upon
drivers to evaluate what they must do to protect platform integrity.
See the ongoing tightening of /dev/mem-like interfaces for an example
of the shrinking ability of root to have unfettered access to all
platform/hardware capabilities.
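
To make the gating concrete, here is a minimal userspace sketch of the
kind of check a driver could apply before accepting a RAW command. It
is an illustration only, not the code in this patch: struct raw_policy
and raw_command_allowed() are hypothetical names, and the two booleans
merely model the build-time option and an active lockdown policy.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the two inputs discussed above. Neither the
 * struct nor its field names come from the patch.
 */
struct raw_policy {
	bool raw_config_enabled; /* models CONFIG_CXL_MEM_RAW_COMMANDS=y */
	bool locked_down;        /* models an active kernel lockdown policy */
};

/*
 * A RAW command is accepted only when the distributor opted in at
 * build time AND the running system is not locked down.
 */
static bool raw_command_allowed(const struct raw_policy *p)
{
	if (!p->raw_config_enabled)
		return false; /* default N: the interface is compiled out */

	if (p->locked_down)
		return false; /* platform integrity takes precedence over root */

	return true;
}
```

The point of the ordering is that the build-time choice belongs to the
distributor, while the lockdown check is a runtime policy that can
still refuse the command on a kernel built with the option enabled.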

CXL is unique in that it impacts "System RAM" resources and that it
interleaves multiple devices. Compare this to NVME where the blast
radius of misbehavior is contained to an endpoint and is behind an
IOMMU. To me, the larger impact increases the responsibility of CXL
enabling to review system impacts, and vendor-specific functionality
is typically unreviewable.

There are two proposals I can see to address the unreviewability problem.
First, of course, get commands into the standard proper. One strawman
proposal is to take the "Code First" process that seems to be working
well for the ACPI and UEFI working groups and apply it to CXL command
definitions. That vastly shortens the time between proposal and Linux
enabling. The second proposal is to define a mechanism for de-facto
standards to develop. That need, I believe, was the motivation for
"designated vendor-specific" in the first instance? I.e. to share
implementations across vendors pre-standardization.

So, allocate a public id for the command space, publish a public
specification, and then send kernel patches. This was the process for
accepting command sets outside of ACPI into the LIBNVDIMM subsystem.
See drivers/acpi/nfit/nfit.h for the reference to the public command
sets.
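
As a rough sketch of what "allocate a public id, publish a
specification, then send patches" could look like on the kernel side,
the fragment below keeps a table of known command families and refuses
anything that is not in it. Everything here is assumed for
illustration: struct cmd_family, known_families[], lookup_family(),
the id values, and the spec strings are placeholders, not the contents
of nfit.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one publicly documented command family. */
struct cmd_family {
	uint32_t id;      /* publicly allocated id (placeholder value) */
	const char *spec; /* pointer to the published specification */
};

/* Placeholder entries; a real table would grow one patch at a time. */
static const struct cmd_family known_families[] = {
	{ 1, "example family A, public specification" },
	{ 2, "example family B, public specification" },
};

/* Return the matching family, or NULL if no public spec is known. */
static const struct cmd_family *lookup_family(uint32_t id)
{
	size_t i;

	for (i = 0; i < sizeof(known_families) / sizeof(known_families[0]); i++)
		if (known_families[i].id == id)
			return &known_families[i];

	return NULL;
}
```

A caller would dispatch a vendor command only when lookup_family()
returns non-NULL, so every command the kernel forwards traces back to
a specification that reviewers can read.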