Re: [PATCH] xen/pciback: Prevent NULL pointer dereference in quirks_show

From: Boris Ostrovsky
Date: Mon Dec 09 2019 - 16:18:46 EST

On 12/9/19 1:16 PM, Nuernberger, Stefan wrote:
On Fri, 2019-12-06 at 15:15 -0500, Boris Ostrovsky wrote:
On 12/6/19 1:09 PM, Nuernberger, Stefan wrote:
On Fri, 2019-12-06 at 10:11 -0500, Boris Ostrovsky wrote:
On 12/6/19 8:48 AM, Stefan Nuernberger wrote:
From: Uwe Dannowski <uwed@xxxxxxxxx>
 list_for_each_entry(cfg_entry, &dev_data->config_fields, list) {
Couldn't you have the same race here?
Not quite the same, but it might not be entirely safe yet. The
'quirks_show' takes the 'device_ids_lock' and races with unbind /
'pcistub_device_release' "which takes device_lock mutex". So this
could now be a UAF read access instead of a NULL pointer dereference.
Yes, that's what I meant (although I don't see much difference in
Well, the NULL ptr access causes an instant kernel panic whereas we
have not attributed crashes to the possible UAF read until now.

We have
not observed adverse effects in our testing (compared to the
obvious issues with NULL pointer) but that's not a guarantee of

So should quirks_show actually be protected by pcistub_devices_lock
instead as are other functions that access dev_data? Does it need
locks in that case?
device_ids_lock protects device_ids list, which is not what you are
trying to access, so that doesn't look like the right lock to hold. And
AFAICT pcistub_devices_lock is not held when device data is cleared in
pcistub_device_release() (which I think is where we are racing).
Indeed. The xen_pcibk_quirks list does not have a separate lock to
protect it. It's either modified under 'pcistub_devices_lock', from
pcistub_remove(), or iterated over with the 'device_ids_lock' held in
quirks_show(). Also the quirks list is amended from
  -> xen_pcibk_config_init_dev()
   -> xen_pcibk_config_quirks_init()
without holding any lock at all. In fact the
pcistub_init_devices_late() and pcistub_seize() functions deliberately
release the pcistub_devices_lock before calling pcistub_init_device().
This looks broken.


The race is between
  -> pcistub_device_put()
   -> pcistub_device_release()
on one side and the quirks_show() on the other side. The problematic
quirk is freed from the xen_pcibk_quirks list in pcistub_remove() early
on under pcistub_devices_lock before the associated dev_data is freed
eventually. So switching from device_ids_lock to pcistub_devices_lock
in quirks_show() could be sufficient to always have valid dev_data for
all quirks in the list.

Yes, that should do it. (I missed the xen_pcibk_config_quirk_release() call, which is why I wasn't sure pcistub_devices_lock was held where necessary.)

There is also pcistub_put_pci_dev() possibly in the race, called from
xen_pcibk_remove_device(), or xen_pcibk_xenbus_remove(), or
pcistub_remove(). The pcistub_remove() call site is safe when we switch
to pcistub_devices_lock (same reasoning as above). For the others I
currently do not see when the quirks are ever freed?

I wonder whether we should call xen_pcibk_config_quirk_release() from pcistub_device_release() under pcistub_devices_lock.
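
To make the locking idea concrete, here is a rough userspace model, not a patch against pciback. Everything named model_* is hypothetical: a pthread mutex stands in for pcistub_devices_lock, a plain singly linked list stands in for xen_pcibk_quirks, and the structs are stripped down to the one field needed for illustration. The only point is that the reader and the release path take the same lock, and that the quirk is unlinked under that lock before its dev_data is freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the per-device data and the quirk entry. */
struct model_dev_data { int n_fields; };

struct model_quirk {
	struct model_quirk *next;
	unsigned int devid;
	struct model_dev_data *dev_data;
};

/* One lock for the list and for the lifetime of dev_data reachable from it. */
static pthread_mutex_t model_devices_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_quirk *model_quirks;

/* Link a new quirk (with its dev_data) into the list under the lock. */
int model_quirk_add(unsigned int devid, int n_fields)
{
	struct model_quirk *q = malloc(sizeof(*q));
	struct model_dev_data *d = malloc(sizeof(*d));

	if (!q || !d) {
		free(q);
		free(d);
		return -1;
	}
	d->n_fields = n_fields;
	q->devid = devid;
	q->dev_data = d;

	pthread_mutex_lock(&model_devices_lock);
	q->next = model_quirks;
	model_quirks = q;
	pthread_mutex_unlock(&model_devices_lock);
	return 0;
}

/* Release path: unlink the quirk under the lock, then free it.
 * Once unlinked, no reader holding the lock can reach dev_data. */
int model_device_release(unsigned int devid)
{
	struct model_quirk **pp, *q = NULL;

	pthread_mutex_lock(&model_devices_lock);
	for (pp = &model_quirks; *pp; pp = &(*pp)->next) {
		if ((*pp)->devid == devid) {
			q = *pp;
			*pp = q->next;
			break;
		}
	}
	pthread_mutex_unlock(&model_devices_lock);

	if (!q)
		return -1;
	free(q->dev_data);
	free(q);
	return 0;
}

/* Reader: walks the list under the same lock as the release path,
 * so every dev_data it dereferences is still valid. */
int model_quirks_show(char *buf, size_t size)
{
	struct model_quirk *q;
	size_t count = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';

	pthread_mutex_lock(&model_devices_lock);
	for (q = model_quirks; q; q = q->next) {
		int n;

		assert(q->dev_data != NULL);
		n = snprintf(buf + count, size - count, "%04x:%d\n",
			     q->devid, q->dev_data->n_fields);
		if (n < 0 || (size_t)n >= size - count) {
			buf[count] = '\0';	/* drop the truncated entry */
			break;
		}
		count += (size_t)n;
	}
	pthread_mutex_unlock(&model_devices_lock);
	return (int)count;
}
```

In this model a caller would add an entry with model_quirk_add(), read it back with model_quirks_show(), and drop it with model_device_release(); after the release the reader no longer sees the entry. Whether the real release path can take pcistub_devices_lock at that point without ordering problems is not something this sketch answers.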