Re: A race condition between debugfs and seq_file operation

From: Matthias Schiffer
Date: Wed Jun 10 2015 - 10:12:56 EST


On 06/10/2015 07:20 AM, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> On Wed, Jun 10, 2015 at 05:00:03AM +0000, Lisa Du wrote:
>>> -----Original Message-----
>>> From: gregkh@xxxxxxxxxxxxxxxxxxx [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
>>> Sent: June 10, 2015 5:12
>>> To: Lisa Du
>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>> Subject: Re: A race condition between debugfs and seq_file operation
>>>
>>> On Mon, Jun 08, 2015 at 04:28:10AM +0000, Lisa Du wrote:
>>>> Hi, All
>>>> Recently I met one race condition related to debugfs.
>>>>
>>>> Take an example from ion.c in kernel3.14:
>>>> static int ion_debug_client_open(struct inode *inode, struct file *file)
>>>> {
>>>> 	return single_open(file, ion_debug_client_show, inode->i_private);
>>>> }
>>>>
>>>> static const struct file_operations debug_client_fops = {
>>>> .open = ion_debug_client_open,
>>>> .read = seq_read,
>>>> .llseek = seq_lseek,
>>>> .release = single_release,
>>>> };
>>>> client->debug_root = debugfs_create_file(client->display_name, 0664,
>>>> dev->clients_debug_root,
>>>> client, &debug_client_fops);
>>>>
>>>> I found that while I am reading the debugfs node, the driver can call
>>>> debugfs_remove_recursive(dentry). Is that expected?
>>>
>>> Yes. Well, not "expected", but a mess, yes.
>>>
>>> Removing debugfs files is known to have lots of races, this isn't the only
>>> one :(
>> Thanks for the reply!
>> Not sure if there is any plan to resolve such races in the future?
>
> Yes, I have "plans", but it's on my very long todo list behind lots of
> other things...
>
> If you want to look into it, please, that would be wonderful.
>
> thanks,
>
> greg k-h

I stumbled across related issues a few days ago (mostly in network
drivers). What I've found out:

* I couldn't find any driver with device-specific debugfs files that
removes them in a race-free way
* Userspace can make the race window arbitrarily large by opening a
debugfs file and reading from it later:

modprobe batman-adv
modprobe dummy
echo bat0 > /sys/class/net/dummy0/batman_adv/mesh_iface
(sleep 5; cat) < /sys/kernel/debug/batman_adv/bat0/originators &
echo none > /sys/class/net/dummy0/batman_adv/mesh_iface
# When the sleep finishes, batman-adv will read from a freed net_device

* There also seems to be a bug where debugfs_remove_recursive() hangs when
removing subdirectories with files that are still open:

modprobe mac80211_hwsim
# Or whatever phyX the hwsim PHY is
(sleep 5; cat) < \
/sys/kernel/debug/ieee80211/phy0/statistics/retry_count &
rmmod mac80211_hwsim
# Will hang in wiphy_unregister() until the sleep finishes,
# with RTNL held!

Is there a sane way to check from the read fops callback if the file has
been removed (and lock against removal while doing that)? The nice
debugfs_create_u32() etc. helpers are useless as well for dynamic files
at the moment as they can't be used without this race condition...
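
To illustrate the kind of check being asked about, here is a minimal
userspace sketch using pthreads. It is a model only, not kernel code and
not an existing debugfs API: struct file_guard, guard_get(), guard_put(),
guard_remove(), struct stats and stats_read() are all hypothetical names.
Each callback enters the guard before touching the data, and removal
marks the guard dead and then waits only for callbacks that are currently
running.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical guard that lives alongside the data a debug file exposes. */
struct file_guard {
	pthread_mutex_t lock;
	pthread_cond_t drained;
	int active;	/* callbacks currently inside the guarded section */
	bool removed;	/* set once removal has started */
};

static void guard_init(struct file_guard *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->drained, NULL);
	g->active = 0;
	g->removed = false;
}

/* Enter a callback: returns 0 on success, -ENODEV if removal has begun. */
static int guard_get(struct file_guard *g)
{
	int ret = 0;

	pthread_mutex_lock(&g->lock);
	if (g->removed)
		ret = -ENODEV;
	else
		g->active++;
	pthread_mutex_unlock(&g->lock);
	return ret;
}

/* Leave a callback and wake a waiting remover once nobody is inside. */
static void guard_put(struct file_guard *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->active > 0);
	if (--g->active == 0)
		pthread_cond_broadcast(&g->drained);
	pthread_mutex_unlock(&g->lock);
}

/* Removal: refuse new callers, then wait for the running ones to leave. */
static void guard_remove(struct file_guard *g)
{
	pthread_mutex_lock(&g->lock);
	g->removed = true;
	while (g->active > 0)
		pthread_cond_wait(&g->drained, &g->lock);
	pthread_mutex_unlock(&g->lock);
}

/* Hypothetical per-device data behind one debug file. */
struct stats {
	struct file_guard guard;
	unsigned int retry_count;
};

/* Model of a read callback: only touches the data while holding the guard. */
static ssize_t stats_read(struct stats *s, char *buf, size_t len)
{
	int ret = guard_get(&s->guard);

	if (ret)
		return ret;
	ret = snprintf(buf, len, "%u\n", s->retry_count);
	guard_put(&s->guard);
	return ret;
}
```

In this model a reader that opened the file before removal gets -ENODEV
on a later read instead of touching freed data, and guard_remove() blocks
only for the duration of a running callback, not for as long as the file
stays open. After guard_remove() returns, the data other than the guard
could be torn down; the guard itself has to outlive the last open file.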

I'd also like to get this cleaned up as soon as possible as changes I
plan for batman-adv might make the issue more prominent there.

Matthias
