Re: [PATCH 1/2] sysfs: Fail bin file mmap if vma close is implemented.

From: Eric W. Biederman
Date: Wed Sep 22 2010 - 17:01:51 EST


Hugh Dickins <hughd@xxxxxxxxxx> writes:

> On Mon, 20 Sep 2010, Eric W. Biederman wrote:
>>
>> It is not reasonably possible to wrap vma->close(). To correctly
>> wrap close would imply calling close on any vmas that remain when
>> sysfs_remove_bin_file is called. Finding the proper lists walking
>> them getting the locking right etc, requires deep knowledge of the
>> mm subsystem and as such would require assistence from the mm
>> subsystem to implement. That assistence does not currently exist.
>
> Isn't it fair to assume that the call to sysfs_remove_bin_file() would
> come from the module (or whatever) that would be handling the ->mmap,
> ->open, ->close, ->fault etc. calls?

It is.

> In which case, it has been kept up to date with the reference counting
> so far: and if it judges that it's apppropriate now to remove bin file,
> despite having received more mmaps and opens than closes, then I think
> it should be allowed to make that decision (knowing that your sysfs end
> will take care not to call it further), even though it wanted to support
> close while the sysfs bin file was still there.

sysfs is that reference counting layer and removes the burden of that
decision from the drivers.

There are two cases this can happen that are interesting: on rmmod of
the driver, and when the underlying hardware for the driver is removed,
thing usb or pci hotplug. In those cases the driver really doesn't have
much choice but to clean as best it can, blocking and waiting for
references counts to go to zero is not a good option.

> I think it would be nicer to leave the bin_vma_close() handling as you
> had it before (but repositioning its sysfs_get_active more carefully,
> as in your 2/2).

I definitely agree that allowing an implementing that implements close
would definitely be a good thing. Especially as 44 out of 99
vm_operations_structs implement close. To implement close properly
requires finding all of the vmas and calling the underlying close
method, and for that I need support from the mm layer which I don't yet
have. Close is typically used to either free per vma state, or to
decrement a reference count that open creates, so having mismatched
open and closes really is a problem.

In practice there are only two files that implement the mmap method
of sysfs bin files.

drivers/pci/pci-sysfs.c
arch/alpha/kernel/pci-sysfs.c

Which in their own way implement mmap for the pci memory mapped resource
bars of given platform. Each arch implements
pci_mmap_legacy_page_range and pci_mmap_resource in their own way. The
few I have traced come down to io_remap_pfn_range which boils down to
remap_pfn_range. remap_pfn_range doesn't even install
vm_operations_struct so in practice none of the sysfs vma methods get
exercised.

I complain in mmap in case some arch or some implementation is doing
something strange that I missed, because I really do want to be as
general as possible.

I do have a generic version of this brewing that I hope to replace the
sysfs version with that modifies the mm layer with the support I need,
and I will copy you on that when I post it. sysfs is not quite perfect
but proc is actively broken in this area, and I don't know how many
other places in the kernel need fixes.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/