Re: [PATCH] driver core: get kobject ref when accessing dev_attrs

From: Greg KH
Date: Sat Jul 20 2024 - 01:18:07 EST


On Fri, Jul 19, 2024 at 11:55:13AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@xxxxxxxxxx>
>
> Get a reference to the device's kobject while storing and showing device
> attributes so that the device is valid for the lifetime of the sysfs access.
> Without this, the device may be released and use-after-free will occur.
>
> This is an easy problem to recreate with pci switches. Basic topology on a my
> qemu test machine:
>
> -[0000:00]-+-00.0 Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> +-01.0-[01-04]----00.0-[02-04]--+-00.0-[03]--
> \-01.0-[04]----00.0 Red Hat, Inc. Virtio block device
>
> Simultaneously remove devices 04:00.0 and 01:00.0 and you'll hit it:
>
> # echo 1 > /sys/bus/pci/devices/0000\:04\:00.0/remove &
> # echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/remove

So you remove the parent before the child and also want to remove the
child at the same time? You are going to have bad problems here :)

> Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b9b: 0000 [#1] PREEMPT SMP PTI
> CPU: 9 PID: 359 Comm: bash Not tainted 6.10.0-rc1-00183-gea4516b2b250 #256
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-14-g1e1da7a96300-prebuilt.qemu.org 04/01/2014
> RIP: 0010:pci_stop_bus_device+0x15/0x90
> Code: 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 48 89 fd 53 4c 8b 67 18 4d 85 e4 74 2d <49> 8b 7c 24 30 49 83 c4 28 4c 39 e7 74 1f 48 8b 5f 08 e8 d4 ff ff
> RSP: 0018:ffffc90000b67dd0 EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff88800768c0c8 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff8365bc00 RDI: ffff88800768c000
> RBP: ffff88800768c000 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffc90000b67df0 R11: 0000000000000003 R12: 6b6b6b6b6b6b6b6b
> R13: ffffc90000b67e90 R14: ffff8880075bccc8 R15: ffff88801082a620
> FS: 00007fe0c951e740(0000) GS:ffff88803ea80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000560aaac4b030 CR3: 00000000108ae004 CR4: 0000000000770ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> ? die_addr+0x37/0x90
> ? exc_general_protection+0x1e5/0x430
> ? asm_exc_general_protection+0x26/0x30
> ? pci_stop_bus_device+0x15/0x90
> pci_stop_and_remove_bus_device_locked+0x1a/0x30
> remove_store+0x7d/0x90
> kernfs_fop_write_iter+0x13c/0x200
> vfs_write+0x359/0x510
> ksys_write+0x69/0xf0
> do_syscall_64+0x68/0x140
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Signed-off-by: Keith Busch <kbusch@xxxxxxxxxx>
> ---
> drivers/base/core.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 131d96c6090be..108f2aa6eaaa9 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2433,12 +2433,15 @@ static ssize_t dev_attr_show(struct kobject *kobj, struct attribute *attr,
> struct device *dev = kobj_to_dev(kobj);
> ssize_t ret = -EIO;
>
> + if (!kobject_get_unless_zero(kobj))
> + return -ENXIO;

We've been down this path before, and it doesn't end well from what I
recall. Attributes that when written to remove themselves need to call
the correct function to do so (look at how scsi does it). I think this
change will now break that functionality. Look in the email archives a
long time ago for more details, I can't recall them at the moment,
sorry.

thanks,

greg k-h