Re: [PATCH 08/14] taint: add taint for direct hardware access

From: Dan Williams
Date: Mon Feb 01 2021 - 14:02:28 EST


On Mon, Feb 1, 2021 at 10:35 AM Ben Widawsky <ben.widawsky@xxxxxxxxx> wrote:
>
> On 21-02-01 13:18:45, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 29, 2021 at 04:24:32PM -0800, Ben Widawsky wrote:
> > > For drivers that moderate access to the underlying hardware it is
> > > sometimes desirable to allow userspace to bypass restrictions. Once
> > > userspace has done this, the driver can no longer guarantee the sanctity
> > > of either the OS or the hardware. When in this state, it is helpful for
> > > kernel developers to be made aware (via this taint flag) of this fact
> > > for subsequent bug reports.
> > >
> > > Example usage:
> > > - Hardware xyzzy accepts 2 commands, waldo and fred.
> > > - The xyzzy driver provides an interface for using waldo, but not fred.
> > > - quux is convinced they really need the fred command.
> > > - xyzzy driver allows quux to frob hardware to initiate fred.
> >
> > Would it not be easier to _not_ frob the hardware for fred-operation?
> > Aka not implement it or just disallow in the first place?
>
> Yeah. So the idea is you either are in a transient phase of the command and some
> future kernel will have real support for fred - or a vendor is being short
> sighted and not adding support for fred.
>
> >
> >
> > > - kernel gets tainted.
> > > - turns out fred command is borked, and scribbles over memory.
> > > - developers laugh while closing quux's subsequent bug report.
> >
> > Yeah good luck with that theory in-the-field. The customer won't
> > care about this and will demand a solution for doing fred-operation.
> >
> > Just easier to not do fred-operation in the first place, no?
>
> The short answer is, in an ideal world you are correct. See nvdimm as an example
> of the real world.
>
> The longer answer: unless we want to wait until we have all the hardware we're
> ever going to see, it's impossible to have a fully baked and validated
> interface. The RAW interface is my admission that I make no guarantees about
> being able to provide the perfect interface, and it gives the power back to the
> hardware vendors and their driver writers.
>
> As an example, suppose a vendor shipped a device with their special vendor
> opcode. They can enable their customers to use that opcode on any driver
> version. That seems pretty powerful and worthwhile to me.
>

Powerful, frightening, and questionably worthwhile when there are
already examples of commands that need extra coordination for whatever
reason. However, I still think the decision tilts towards allowing
this given ongoing spec work.

NVDIMM ended up allowing unfettered vendor passthrough given the lack
of an organizing body to unify vendors. CXL on the other hand appears
to have more gravity to keep vendors honest. A WARN splat with a
taint, and a debugfs knob for the truly problematic commands seems
sufficient protection of system integrity while still following the
Linux ethos of giving system owners enough rope to make their own
decisions.
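
To make that policy concrete, here is a rough userspace C model of it,
not kernel code and not taken from the patch series. Every name in it
(struct raw_policy, raw_allow_all, classify(), submit_raw()) and both
opcode values are hypothetical. In the driver, the taint would go
through the kernel's taint machinery and the knob would live in
debugfs; this sketch only shows the decision flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical opcode classes; not taken from the CXL spec. */
enum raw_class {
    CMD_SUPPORTED,   /* the driver has a real interface for it */
    CMD_RAW,         /* unknown or vendor opcode: allowed, but taints */
    CMD_PROBLEMATIC, /* needs extra coordination: refused unless knob set */
};

struct raw_policy {
    bool tainted;       /* stands in for the kernel taint flag */
    bool raw_allow_all; /* stands in for the debugfs knob */
};

/* Made-up opcode assignments, purely for illustration. */
enum raw_class classify(unsigned opcode)
{
    if (opcode == 0x0001)
        return CMD_SUPPORTED;   /* "waldo" */
    if (opcode == 0x4300)
        return CMD_PROBLEMATIC;
    return CMD_RAW;             /* "fred" and everything else */
}

/* Returns 0 if the command may be sent, -1 if it is refused. */
int submit_raw(struct raw_policy *p, unsigned opcode)
{
    assert(p != NULL);

    switch (classify(opcode)) {
    case CMD_SUPPORTED:
        return 0;
    case CMD_PROBLEMATIC:
        if (!p->raw_allow_all)
            return -1;
        /* fall through */
    case CMD_RAW:
        /* Warn once, on the transition to the tainted state. */
        if (!p->tainted)
            fprintf(stderr, "WARN: raw opcode 0x%04x, tainting\n", opcode);
        p->tainted = true;
        return 0;
    }
    return -1;
}
```

In this model a supported command never taints, an unknown opcode goes
through but taints, and a problematic opcode is refused until the
system owner flips the knob, at which point it also taints.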

> Or a more realistic example, we ship a driver that adds a command which is
> totally broken. Customers can utilize the RAW interface until it gets fixed in a
> subsequent release which might be quite a ways out.
>
> I'll say the RAW interface isn't an encouraged usage, but it's one that I expect
> to be needed, and if it's not we can always try to kill it later. If nobody is
> actually using it, nobody will complain, right :D

It might be worthwhile to make RAW support a compile-time decision so
that Linux distros can ship support for only the commands the CXL
driver-dev community has blessed, but I'll leave it to a distro
developer to second that approach.
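
A minimal sketch of what a compile-time switch could look like, again
as a userspace C model under stated assumptions. The macro
MODEL_CXL_RAW_COMMANDS, the function send_command_model(), and the
blessed-opcode list are all hypothetical; in the kernel this would
presumably be a Kconfig symbol rather than a bare #ifdef.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build-time switch; a distro would leave it undefined. */
#ifdef MODEL_CXL_RAW_COMMANDS
#define RAW_COMPILED_IN 1
#else
#define RAW_COMPILED_IN 0
#endif

/*
 * Returns 0 if the opcode may be sent, -1 otherwise.
 * Opcodes on the blessed list are always allowed; anything else is
 * allowed only when RAW support was compiled in.
 */
int send_command_model(unsigned opcode, const unsigned *blessed, size_t n)
{
    size_t i;

    assert(blessed != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (blessed[i] == opcode)
            return 0;
    }
    return RAW_COMPILED_IN ? 0 : -1;
}
```

Built without the macro, the model accepts only the blessed opcodes;
built with -DMODEL_CXL_RAW_COMMANDS, it accepts everything.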