Re: [PATCH] PCI: Allow drivers to claim exclusive access to config regions

From: Dan Williams
Date: Thu May 13 2021 - 17:28:13 EST


On Sat, Mar 27, 2021 at 3:47 AM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Mar 26, 2021 at 11:12:47AM -0500, Bjorn Helgaas wrote:
> > [+cc Christoph]
> >
> > On Wed, Mar 24, 2021 at 06:23:54PM -0700, Dan Williams wrote:
> > > The PCIE Data Object Exchange (DOE) mailbox is a protocol run over
> > > configuration cycles. It assumes one initiator at a time is
> > > reading/writing the data registers. If userspace reads from the response
> > > data payload it may steal data that a kernel driver was expecting to
> > > read. If userspace writes to the request payload it may corrupt the
> > > request a driver was trying to send.
> >
> > IIUC the problem we're talking about is that userspace config access,
> > e.g., via "lspci" or "setpci" may interfere with kernel usage of DOE.
> > I attached what I think are the relevant bits from the spec.
> >
> > It looks to me like config *reads* should not be a problem: A read of
> > Write Data Mailbox always returns 0 and looks innocuous. A userspace
> > read of Read Data Mailbox may return a DW of the data object, but it
> > doesn't advance the cursor, so it shouldn't interfere with a kernel
> > read.
> >
> > A write to Write Data Mailbox could obviously corrupt an object being
> > written to the device. A config write to Read Data Mailbox *does*
> > advance the cursor, so that would definitely interfere with a kernel
> > user.
> >
> > So I think we're really talking about an issue with "setpci" and I
> > don't expect "lspci" to be a problem. "setpci" is a valuable tool,
> > and the fact that it can hose your system is not really news. I don't
> > know how hard we should work to protect against that.
>
> Thanks for looking this up and letting us know.
>
> So this should be fine, reads are ok, it's not as crazy of a protocol
> design as Dan alluded to, so the kernel should be ok. No need to add
> additional "protection" here at all, if you run setpci from userspace,
> you get what you asked for :)
>

Circling back to this after thinking of the implications and looking
at the review of the DOE code, this situation is different than your
typical "userspace gets to keep the pieces if it does a configuration
write". If we assume well behaved non-malicious userpace, it has no
reason to muck with critical config registers. If userspace changes a
BAR value and the system fails, yup, that's its own fault. The DOE
mailbox is different. There are legitimate reasons why non-broken
userspace would want to read some DOE payloads while the kernel is
retrieving its payloads. It also simplifies the kernel implementation
if it does need to worry about other agents interrupting its
transfers. My mistake was making this restriction apply to reads, but
I'm not on the same page that blocking writes is fruitless just
because userspace can do other damage with config writes.

So either the kernel DOE driver needs to use pci_cfg_access_lock() and
revalidate the state of the DOE after acquiring that lock, or it needs
to claim the interface completely and provide a driver for userspace
to submit requests that can be scheduled in the kernel's DOE queue.