Re: [PATCH v2 2/2] debugfs: prevent access to removed files' private data
From: Nicolai Stange
Date: Tue Feb 09 2016 - 17:03:58 EST
Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes:
> On Mon, Feb 08, 2016 at 09:00:05PM +0100, Nicolai Stange wrote:
>> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes:
>>
>> > On Mon, Feb 08, 2016 at 06:14:58PM +0100, Nicolai Stange wrote:
>> >> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> writes:
>> >>
>> >> > On Mon, Feb 08, 2016 at 04:03:27PM +0100, Nicolai Stange wrote:
>> >> >> Upon return of debugfs_remove()/debugfs_remove_recursive(), it might
>> >> >> still be attempted to access associated private file data through
>> >> >> previously opened struct file objects. If that data has been freed by
>> >> >> the caller of debugfs_remove*() in the meanwhile, the reading/writing
>> >> >> process would either encounter a fault or, if the memory address in
>> >> >> question has been reassigned again, unrelated data structures could get
>> >> >> overwritten.
>> >> >>
>> >> >> However, since debugfs files are seldomly removed, usually from module
>> >> >> exit handlers only, the impact is very low.
>> >> >>
>> >> >> Since debugfs_remove() and debugfs_remove_recursive() are already
>> >> >> waiting for a SRCU grace period before returning to their callers,
>> >> >> enclosing the access to private file data from ->read() and ->write()
>> >> >> within a SRCU read-side critical section does the trick:
>> >> >> - Introduce the debugfs_file_use_data_start() and
>> >> >> debugfs_file_use_data_finish() helpers which just enter and leave
>> >> >> a SRCU read-side critical section. The former also reports whether the
>> >> >> file is still alive, that is if d_delete() has _not_ been called on
>> >> >> the corresponding dentry.
>> >> >> - Introduce the DEFINE_DEBUGFS_ATTRIBUTE() macro which is completely
>> >> >> equivalent to the DEFINE_SIMPLE_ATTRIBUTE() macro except that
>> >> >> ->read() and ->write are set to SRCU protecting wrappers around the
>> >> >> original simple_read() and simple_write() helpers.
>> >> >> - Use that DEFINE_DEBUGFS_ATTRIBUTE() macro for all debugfs_create_*()
>> >> >> attribute creation variants where appropriate.
>> >> >> - Manually introduce SRCU protection to the debugfs-predefined readers
>> >> >> and writers not covered by the above DEFINE_SIMPLE_ATTRIBUTE()->
>> >> >> DEFINE_DEBUGFS_ATTRIBUTE() replacement.
>> >> >>
>> >> >> Finally, it should be worth to note that in the vast majority of cases
>> >> >> where debugfs users are handing in a "custom" struct file_operations
>> >> >> object to debugfs_create_file(), an attribute's associated data's
>> >> >> lifetime is bound to the one of the containing module and thus,
>> >> >> taking a reference on ->owner during file opening acts as a proxy here.
>> >> >> There is no need to do a mass replace of DEFINE_SIMPLE_ATTRIBUTE() to
>> >> >> DEFINE_DEBUGFS_ATTRIBUTE() outside of debugfs.
>> >> >>
>> >> >> OTOH, new users of debugfs are encouraged to prefer the
>> >> >> DEFINE_DEBUGFS_ATTRIBUTE() macro over DEFINE_SIMPLE_ATTRIBUTE() and it,
>> >> >> as well as the needed read/write wrappers are made available globally.
>> >> >> For new users implementing their own readers and writers, the lifetime
>> >> >> management helpers debugfs_file_use_data_start() and
>> >> >> debugfs_file_use_data_finish() are exported.
>> >> >
>> >> > Nice job. One more request... :)
>> >> >
>> >> > Can you show how you would convert a subsystem to use these new
>> >> > macros/calls to give a solid example of it in use outside of the debugfs
>> >> > core?
>> >>
>> >> You mean in the form of a patch [3/3] for an arbitrary subsystem other
>> >> than debugfs? Or in the form of an update of
>> >> Documentation/filesystems/debugfs.txt?
>> >
>> > For an arbritary subsystem would be great. Showing how this should be
>> > used / converted tree-wide.
>> >
>> >> In case you want to have a patch: for the DEFINE_DEBUGFS_ATTRIBUTE, I
>> >> could simply abuse
>> >> drivers/staging/android/ion/ion.c
>> >> as it has got a DEFINE_SIMPLE_ATTRIBUTE debug_shrink_fops passed to
>> >> debugfs. In this particular case, it even looks like that this debugfs
>> >> file can be removed through ion_client_destroy() without any module
>> >> removal. Fixing this would be as easy as
>> >> s/DEFINE_SIMPLE_ATTRIBUTE/DEFINE_DEBUGFS_ATTRIBUTE/.
>> >
>> > Great, why wouldn't we do that for all users of debugfs that have this
>> > type of interaction with it?
>>
>> So this is a "yes", I should include these kind of fixes within this
>> series as [3/X], [4/X], ..., [X/X]?
>
> Yes please.
>
>> Last time I checked the tree (Nov.), there weren't any users of this
>> kind (debugfs file removal w/o module unload).
>> Obviously I missed ion though... I will recheck.
Uhm, I must have been drinking back then.
The real statistics are:
There are 101 DEFINE_SIMPE_ATTRIBUTE fops handed to debugfs.
>From these, 49 suffer from "dynamic" debugfs file removal races.
These 49 are distributed over the following 15 files:
drivers/gpu/drm/i915/i915_debugfs.c
drivers/hwmon/asus_atk0110.c
drivers/iio/gyro/adis16136.c
drivers/iio/imu/adis16400_core.c
drivers/iio/imu/adis16480.c
drivers/input/touchscreen/edt-ft5x06.c
drivers/misc/cxl/debugfs.c
drivers/mmc/core/debugfs.c
drivers/net/wimax/i2400m/debugfs.c
drivers/net/wireless/ath/wil6210/debugfs.c
drivers/net/wireless/mac80211_hwsim.c
fs/ceph/debugfs.c
fs/ocfs2/blockcheck.c
lib/fault-inject.c
net/bluetooth/hci_debugfs.c
Details of my manual analysis can be found at
http://nicst.de/debugfs_dsa_users.html
Now, a quick grep shows that there are 1002 call sites of
debugfs_create_file() in total. Minus 101 yields 901 users of debugfs
who hand in custom file_operations (assuming a 1:1 relationship between
DEFINE_SIMPLE_ATTRIBUTE() and debugfs_create_file(), which mostly holds).
Another 342 of these custom file_operations hand-ins are of the seqfile
style, i.e. have their ->read set to seq_read().
(A coccinelle patch for finding these 342 is available at
http://nicst.de/debugfs_seq.cocci
Run spatch with '-D org' or '-D report')
Migrating those to a debugfs_create_seqfile() might be good in terms of
both, avoiding the removal race and refactorization in general.
If only they were not so many...
However, there are still 901-342=559 debugfs_create_file() invocations I
haven't even looked at yet. No clue how safe these are against removal
races.
Before I proceed with the current approach, let me note that there might
be an alternative that would not require modification of each and every
debugfs user:
We could rather not replace the ->f_op in debugfs' proxy_open() (from
this series) with the original one, but instead keep an "extended" proxy
there forever.
This extended proxy would simply proxy every file_operations method to
the original one, under protection of the SRCU lock. However, the
question is what to do with file_operation methods that are NULL in the
original. Probably, -ENOTSUPP most of the time.
Now it's up to you to decide which way to go: mass patching throughout
the tree or the full proxy?
>>
>> >
>> >> Regarding a use case with custom made file_operations whose
>> >> reader and writer are protected by the debugfs_file_use_data_*()
>> >> helpers, I'm a little bit at a loss with: ion.c has got its custom
>> >> 'debug_heap_fops', but in this case, it would probably be more
>> >> appropriate to create a general debugfs_create_seqfile() centrally in
>> >> debugfs.
>> >
>> > ion is 'rough', but if enough people use seqfile in debugfs, yes, we
>> > should provide a generic interface for it to make it easier to use so
>> > they don't have to roll their own, and so they get the fixes you did
>> > here for their code as well.
>>
>> A quick check revealed that there are *many* such seqfile users.
>>
>> Since these would all get touched, I think it is better to postpone the
>> introduction of a debugfs_create_seqfile() to another series dedicated
>> to that?
>
> Yes that would be a good idea.
>
> thanks,
>
> greg k-h