Re: Broken pci_block_user_cfg_access interface

From: Michael S. Tsirkin
Date: Thu Aug 25 2011 - 05:39:57 EST


On Thu, Aug 25, 2011 at 11:19:54AM +0200, Jan Kiszka wrote:
> On 2011-08-24 17:02, Brian King wrote:
> > On 08/24/2011 05:43 AM, Jan Kiszka wrote:
> >> Hi,
> >>
> >> trying to port the generic device interrupt masking pattern of
> >> uio_pci_generic to KVM's device assignment code, I stumbled over some
> >> fundamental problem with the current pci_block/unblock_user_cfg_access
> >> interface: it does not provide any synchronization between blocking
> >> sides. This allows user space to trigger a kernel BUG, just run two
> >>
> >> while true; do echo 1 > /sys/bus/pci/devices/<some-device>/reset; done
> >>
> >> loops in parallel and watch the kernel oops.
> >>
> >> Instead of some funky open-coded locking mechanism, we would rather need
> >> a plain mutex across both the user space access (via sysfs) and the
> >> sections guarded by pci_block/unblock_user_cfg_access so far. But I'm
> >> not sure which of them already allow sleeping, specifically if the IPR
> >> driver would be fine with such a change. Can someone in the CC list
> >> comment on this?
> >
> > The ipr driver calls pci_block/unblock_user_cfg_access from interrupt
> > context, so a mutex won't work.
>
> Ugh. What precisely does it have to do with the config space while
> running inside an IRQ handler (or holding a lock that synchronizes it
> with such a handler)?
>
> > When the pci_block/unblock API was
> > originally added, it did not have the checking it has today to detect
> > if it is being called nested. This was added some time later. The
>
> For a reason...
>
> > API that works best for the ipr driver is to allow for many block calls,
> > but a single unblock call unblocks access. It seems like what might
> > work well in the case above is a block count. Each call to pci_block
> > increments a count. Each pci_unblock decrements the count and only
> > actually do the unblock if the count drops to zero. It should be reasonably
> > simple for ipr to use that sort of an API as well.
>
> That will just paper over the underlying bug: multiple kernel users (!=
> sysfs access) fiddle with the config space in an unsynchronized fashion.
> Think of sysfs-triggered pci_reset_function while your ipr driver does
> its accesses.
>
> So it's pointless to tweak the current pci_block semantics, we rather
> need to establish a new mechanism that synchronizes *all* users of the
> config space.
>
> Jan

It does look like all of the problems are actually around reset.
So maybe all we need to do is synchronize the sysfs-triggered
pci_reset_function with pci_block/unblock_user_cfg_access?

In other words, when reset is triggered from sysfs, it
should obey pci_block/unblock_user_cfg_access
restrictions?

It does not look like reset needs to sleep, so fixing
that should not be hard, right?

>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/