[PATCHSET #master] sysfs: make sysfs disconnect immediately from kobject on deletion

From: Tejun Heo
Date: Sat Apr 07 2007 - 04:29:09 EST


Hello, all.

This patchset is result of the following thread.

http://thread.gmane.org/gmane.linux.kernel/510293

This patchset takes sysfs out of device driver lifetime equation which
not not only fixes several race conditions caused by sysfs not holding
onto the proper module when referencing kobject, but also helps fixing
and simplifying lifetime management in driver model.

sysfs is peculiar in how it's intertwined with driver model via
kobject and fs layer by using dentry to record some of its hierarchy.
This not only complicates lifetime management outside of sysfs but
also inside sysfs proper. We end up with several different yet
inter-dependent lifetime rules.

For example, dentry depends on sysfs_dirent for file accesses as any
dentry would depend on inode and its backing fs private data to do so,
while sysfs_dirent depends on dentry to walk sysfs_dirent tree for
internal purpose (symlink walk) and the initial access to the dentry
happens by going through kobject pointer. This interdependcy is okay
while all the objects are on memory but hell breaks loose when it's
time to kill those objects. dentry and sysfs_dirent depend on each
other. Unless they go away at the same time or use some way to safely
break the loop, one side ends up with dangled pointer to the other.

This patchset solves this by making sysfs_dirent behave more like fs
internal inodes in other filesystems which don't depend on dentry or
other external entity to manage itself. Most information is already
there. Only sd->s_parent and s_name are added. These do increase the
size of sysfs_dirent a bit but makes the logic look designed more in
Earth instead of Mars and with further changes, dentry and inode for
kobject can be made reclaimable which can probably compensate the
added space overhead.

Sysfs lifetime rules are much simpler now. sd denotes sysfs_dirent.

* sd has default reference of 1 on creation which is dropped on
deletion.

* dentry holds reference to sd. dentry->d_fsdata can be safely
dereferenced while referecne to dentry is held.

* sd->s_parent points to the parent sd and each child holds a
reference to its parent which is released when the child is released
(reference reaches zero), so sd->s_parent can be dereferenced
recursively if reference to the sd is held.

* sd->s_name can always be read while sd is valid.

* sd->s_elem.dir.kobj should only be accessed while
sd->s_elem.dir.rwsem is read locked, which can be done by calling
sysfs_get_dir_kobj() on the sd.

* sd->s_elem.[bin_]attr.[bin_attr] should only be accessed while its
parent's sd->s_elem.dir.rwsem is read locked. If
sysfs_get_dir_kobj() returns NULL, attr pointer might point to
released area.

* sysfs doesn't reference foreign objects for internal purpose.
Foreign objets are accessed from the callbacks or interface
functions where the caller is responsible for guaranteeing
accessbility - symlink interface function is currently an exception.
sysfs should export sysfs_dirent based interface and kobject code
should do the locking.

* Directory dentries are still pinned as they are used in interface
function - this should change in the future.

This patchset is consisted of the following 14 patches.

#01 : fix i_no handling bug and reduce dependency to inode
#02 : fix error handling in binattr write
#03-05 : prep for symlink reimplementation
#06-07 : add s_parent and s_name
#08 : make s_elem a union so that chaning what it points to doesn't
cause chaos and s_elem can contain more than one pointer
#09 : implement kobj_sysfs_assoc_lock to protect kobj->s_dentry
which will be used to keep symlink interface compatible
#10 : reimplement symlink using only sysfs_dirent tree
#11 : implement bin_buffer for immediate-kobject-disconnect
#12 : implement immediate-kobject-disconnect
#13-14 : kill now obsolete stuff

The first 11 are some fixes and preparation for
immediate-kobject-disconnect. Depencies to external objects are
gradually removed such that only accesses during file ops remain which
are converted by patch #12. The last two patches remove now
unnecessary attribute orphaning and attribute->owner.

I've run the following test on a UP machine for several hours without
any oops or memory leak and the test is currently running on a dual
processor (hyperthreading) machine for about half an hour. I'll keep
it running for at least 5 hours.

# (cd kernel-build-dir; while true; do echo [Loading...]; insmod drivers/scsi/scsi_mod.ko; insmod drivers/scsi/sd_mod.ko; insmod drivers/ata/libata.ko; insmod drivers/ata/ahci.ko; sleep 1; echo [Unloading...]; while lsmod | grep -q sd_mod; do rmmod ahci; rmmod libata; rmmod sd_mod; rmmod scsi_mod; sleep .1; done; done) &
# (cd /sys; while true; do ls -liR > /dev/null; done) &
# (cd /sys; while true; do find . | xargs cat > /dev/null 2>&1; done) &
# (cd /sys; while true; do find . | sort | xargs cat > /dev/null 2>&1; done) &

Three harddisks and a dvd writer are attached to on-board ahci
controller, so there are plenty of sysfs activities going on involving
symlinks and all.

Thanks.

--
tejun

drivers/base/class.c | 2 -
drivers/base/core.c | 4 -
drivers/base/firmware_class.c | 2 +-
drivers/block/pktcdvd.c | 3 +-
drivers/char/ipmi/ipmi_msghandler.c | 10 -
drivers/cpufreq/cpufreq_stats.c | 3 +-
drivers/cpufreq/cpufreq_userspace.c | 2 +-
drivers/cpufreq/freq_table.c | 1 -
drivers/firmware/dcdbas.h | 3 +-
drivers/firmware/dell_rbu.c | 6 +-
drivers/firmware/edd.c | 2 +-
drivers/firmware/efivars.c | 6 +-
drivers/i2c/chips/eeprom.c | 1 -
drivers/i2c/chips/max6875.c | 1 -
drivers/infiniband/core/sysfs.c | 1 -
drivers/input/mouse/psmouse.h | 1 -
drivers/media/video/pvrusb2/pvrusb2-sysfs.c | 13 --
drivers/misc/asus-laptop.c | 3 +-
drivers/pci/hotplug/acpiphp_ibm.c | 1 -
drivers/pci/pci-sysfs.c | 4 -
drivers/pcmcia/socket_sysfs.c | 2 +-
drivers/rtc/rtc-ds1553.c | 1 -
drivers/rtc/rtc-ds1742.c | 1 -
drivers/scsi/arcmsr/arcmsr_attr.c | 3 -
drivers/scsi/lpfc/lpfc_attr.c | 2 -
drivers/scsi/qla2xxx/qla_attr.c | 6 -
drivers/spi/at25.c | 1 -
drivers/video/aty/radeon_base.c | 2 -
drivers/video/backlight/backlight.c | 2 +-
drivers/video/backlight/lcd.c | 2 +-
drivers/w1/slaves/w1_ds2433.c | 1 -
drivers/w1/slaves/w1_therm.c | 1 -
drivers/w1/w1.c | 2 -
fs/ecryptfs/main.c | 2 -
fs/ocfs2/cluster/masklog.c | 1 -
fs/partitions/check.c | 1 -
fs/sysfs/bin.c | 189 ++++++++++++-------
fs/sysfs/dir.c | 270 ++++++++++++++++-----------
fs/sysfs/file.c | 206 ++++++++++-----------
fs/sysfs/inode.c | 61 +------
fs/sysfs/mount.c | 10 +-
fs/sysfs/symlink.c | 121 ++++++-------
fs/sysfs/sysfs.h | 109 ++++-------
include/linux/sysdev.h | 3 +-
include/linux/sysfs.h | 8 +-
kernel/module.c | 9 +-
kernel/params.c | 1 -
net/bridge/br_sysfs_br.c | 3 +-
net/bridge/br_sysfs_if.c | 3 +-
49 files changed, 496 insertions(+), 596 deletions(-)


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/