Re: [PATCH] switchtec: cleanup cdev init

From: Logan Gunthorpe
Date: Sun Feb 19 2017 - 23:23:08 EST




On 19/02/17 02:43 PM, Dan Williams wrote:
> Is this race present for all file operations? I've only seen it with
> mmap() and late faults. So if these other drivers do not support mmap
> it's not clear they can trigger the failure.

I don't see how it's related to mmap only. I think there's a number of
variations on this but the race I see happens if you try to take a
reference to the device in the open/close handlers of a char device file.

If a driver puts the last remaining reference in the release handler,
then the device structure will be freed, and on returning from the
release handler, the char_dev code will try to put the last reference to
the cdev structure which may have already been free'd. There needs to be
a way for the cdev structure to take a reference on the device structure
so it doesn't get freed and the kobj.parent trick seems to accomplish
this fairly well.

Really, in any situation where there's a cdev and a device in the same
structure, the life cycles of the two become linked but their reference
counts are not and that is the problem here.

I feel like a number of developers have made the same realization
independently so this would probably be a good thing to clean up.

See these commits for examples spanning around 5 years:

4a215aa Input: fix use-after-free introduced with dynamic minor changes
0d5b7da iio: Prevent race between IIO chardev opening and IIO device
ba0ef85 tpm: Fix initialization of the cdev

Also, seems both my brother and I have independently looked into this
exact issue:

https://www.mail-archive.com/linux-rdma@xxxxxxxxxxxxxxx/msg25692.html

So, I think it would be a good idea to clean it up with Dan's proposed
API cleanup. If I have some time tomorrow, I may throw together a patch
set with that patch and the changes to the instances around the kernel.

Logan