Re: [PATCH] proc: get a reference to the owning module when opening file

From: Al Viro
Date: Tue Feb 17 2015 - 15:15:08 EST

On Wed, Feb 11, 2015 at 11:05:07AM +0800, Nicolas Iooss wrote:
> When a module creates a file in procfs and a program uses the file with
> mmap, the .owner field of the file_operations structure is ignored.
> This allows removing the module while the file is still being used.
> Therefore this sequence of events leads to a kernel oops:
> * load a module which creates a file in procfs with an mmap operation
> * open the file
> * use mmap on it
> * rmmod the module
> * call unmap
> Here are parts of the oops caused by unmap:
> [ 1234.337725] BUG: unable to handle kernel paging request at ffffffffa030a268
> [ 1234.338007] IP: [<ffffffff811bf054>] remove_vma+0x24/0x70
> [ 1234.338007] PGD 1c17067 PUD 1c18063 PMD 3d713067 PTE 0
> [ 1234.338007] Oops: 0000 [#1] SMP
> [ 1234.338007] Modules linked in: fuse rpcsec_gss_krb5 nfsv4
> dns_resolver nfs fscache cfg80211 rfkill ppdev bochs_drm virtio_net
> serio_raw parport_pc ttm pvpanic parport drm_kms_helper drm i2c_piix4
> nfsd auth_rpcgss nfs_acl lockd sunrpc virtio_blk virtio_pci virtio_ring
> virtio ata_generic pata_acpi [last unloaded: procfs_mmap]
> [ 1234.338007] Call Trace:
> [ 1234.338007] [<ffffffff811c155f>] do_munmap+0x27f/0x3b0
> [ 1234.338007] [<ffffffff811c16d1>] vm_munmap+0x41/0x60
> [ 1234.338007] [<ffffffff811c2652>] SyS_munmap+0x22/0x30
> [ 1234.338007] [<ffffffff817301a9>] system_call_fastpath+0x16/0x1b
> Fix this by making proc_reg_open grab a reference to the module owning
> pde->proc_fops.


procfs doesn't pin modules while the file is opened, by design. And
it's not about to start. Background:

* ->owner protects the wrong thing - it protects _code_ in methods, but
not the data structures. Preventing rmmod might act as a proxy in some
cases, but only when objects are _not_ dynamically created/removed.

* what we want is to make sure that no new method calls might be issued
after procfs removal and that procfs removal will wait for method
calls already in progress to complete. Again, the only way it's relevant
to rmmod is that procfs removal is often triggered by module_exit.
BTW, debugfs is seriously broken in that respect - it does protect the
module, all right, but have an object removed earlier and you are fucked.

* mmap is a big can of worms in that respect, mostly in terms of desired
semantics. What should be done to existing mappings when object is removed?
Sure, no new ones should be allowed, that much is obvious. But what should
happen to VMAs already there and what should happen to pages covered by
those? IMO the effect of truncate() on shared file mappings is the best
approximation for what we want there, but there's a considerable amount of
PITA in the vm_operations_struct that needs to be sorted out first. In
particular, ->open() and ->close() are often badly racy *and* there are
all kinds of bogisities with code making assumptions about impossibility of
getting several VMAs over the same object (fork() isn't the only thing that
could lead to that; munmap() the middle will do it as well and VM_DONTCOPY
won't prevent the latter).

* as it is, mmap() in procfs has _very_ limited use - /proc/bus/pci/*/*
is all there, it's never modular and it never uses any data structures
past ->mmap() anyway. All pages are already mapped after that point,
->access() is the only possibly non-trivial method and even that won't
look at anything other than vma itself (no looking into ->vm_file, etc.).

Looks like you went and added another ->mmap() user. I'm not at all sure
that procfs or debugfs are good places for that, but that depends on the
details of what you are trying to do. _IF_ you really need it in procfs,
welcome to cleaning the vm_operations_struct out. Won't be fun - I had
taken stabs at that several times, never got more than a few isolated
patches out of any of those attempts... I have notes from the last such
work and if you are serious about digging through that crap, I can exhumate
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at