[git patches] ocfs2 and configfs updates

From: Mark Fasheh
Date: Wed Jul 11 2007 - 12:23:26 EST


This represents the majority of our 2.6.23-rc1 patches.

I'll start with ocfs2, which gets the following new features:

* Full support for shared writeable mmap. This was implemented using the
->page_mkwrite callback to allow ocfs2 a chance to take cluster locks on
the inode data and allocate file space for writes. This also has the
effect that ocfs2 should never run out of space during ->writepage.

* Some internal improvements to how deallocation of metadata is handled.
This simplified deallocation locking so that we could turn on allocation
of metadata from per-node pools.

* Unwritten extents support. ocfs2 can now pre-allocate inode space without
zeroing data by leaving a special "unwritten" flag in the extent record.
Reads from regions marked as unwritten return zeros and writes to the
regions are guaranteed to never require data allocation. The sparse file
support in ocfs2 (from 2.6.22) included read support for unwritten
extents, so the file system option is an RO compat bit. It won't ever be
turned on by ocfs2-tools without first having the sparse files incompat
bit set.

* Support for removing arbitrary file regions. Conceptually this is the
opposite of unwritten extents support. The ocfs2 btree code is now smart
enough to remove any extent or partial extent, regardless of where it is
in the tree. Previously it was only possible to remove extents from the
rightmost edge during a truncate.
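From user space, the shared writeable mmap support in the first item above is exercised by an ordinary MAP_SHARED mapping. The following is a minimal sketch, not taken from the patches: the helper name write_through_shared_mapping() is made up for illustration, and it assumes a POSIX system. On ocfs2, the first store into each page of the mapping is the point at which ->page_mkwrite would get its chance to take cluster locks and allocate space.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: store `len` bytes from `data` at offset 0 of `path`
 * through a shared, writable mapping. Returns 0 on success, -1 on failure.
 */
int write_through_shared_mapping(const char *path, const void *data, size_t len)
{
	void *map;
	int fd, rc;

	if (len == 0)
		return -1;		/* mmap() rejects zero-length mappings */

	fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -1;

	/* Size the file so the mapped range lies inside it. */
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* The first store to each page faults it in writable. */
	memcpy(map, data, len);

	/* Push the dirty pages back to the file before unmapping. */
	rc = msync(map, len, MS_SYNC);
	munmap(map, len);
	close(fd);
	return rc;
}
```

Nothing in the sketch is ocfs2-specific; the point of the feature is that unmodified applications of this shape now work on a cluster file system.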

The reservation and extent truncate support is exposed to user space via a
set of ioctls compatible with those currently used by XFS. This allows us to
take advantage of the existing installed base of software which uses that
feature. Some weeks ago, I also posted an -mm patch to get ocfs2 under the
->fallocate() callback. Once the sys_fallocate() patches are merged, I
intend to send a version of the ocfs2 patch upstream.
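As a rough illustration of what an XFS-style reservation call looks like from user space, here is a sketch under stated assumptions. The structure layout and the request numbers are modeled on my understanding of XFS's struct xfs_flock64 and XFS_IOC_RESVSP64/XFS_IOC_UNRESVSP64; every example_ name is hypothetical, and the real definitions should be taken from fs/ocfs2/ocfs2_fs.h or the XFS headers rather than from this snippet. It assumes Linux, where <sys/ioctl.h> provides _IOW.

```c
#include <stdint.h>
#include <stdio.h>	/* SEEK_SET */
#include <string.h>
#include <sys/ioctl.h>

/* Assumed layout, modeled on XFS's struct xfs_flock64. Verify before use. */
struct example_space_resv {
	int16_t  l_type;
	int16_t  l_whence;	/* SEEK_SET, SEEK_CUR or SEEK_END */
	int64_t  l_start;	/* start of the region, relative to l_whence */
	int64_t  l_len;		/* length of the region in bytes */
	int32_t  l_sysid;
	uint32_t l_pid;
	int32_t  l_pad[4];
};

/* Assumed request numbers, mirroring the XFS reservation ioctls. */
#define EXAMPLE_IOC_RESVSP64	_IOW('X', 42, struct example_space_resv)
#define EXAMPLE_IOC_UNRESVSP64	_IOW('X', 43, struct example_space_resv)

/* Describe the byte range [start, start + len) measured from file start. */
void example_fill_resv(struct example_space_resv *sr, int64_t start, int64_t len)
{
	memset(sr, 0, sizeof(*sr));
	sr->l_whence = SEEK_SET;
	sr->l_start = start;
	sr->l_len = len;
}

/* Ask the file system to pre-allocate the range (unwritten extents). */
int example_reserve(int fd, int64_t start, int64_t len)
{
	struct example_space_resv sr;

	example_fill_resv(&sr, start, len);
	return ioctl(fd, EXAMPLE_IOC_RESVSP64, &sr);
}

/* Ask the file system to release the range (remove the file region). */
int example_unreserve(int fd, int64_t start, int64_t len)
{
	struct example_space_resv sr;

	example_fill_resv(&sr, start, len);
	return ioctl(fd, EXAMPLE_IOC_UNRESVSP64, &sr);
}
```

Both calls return -1 with errno set on file systems that do not implement the request, so callers should be prepared to fall back to writing zeros.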


On the Configfs side, these patches include a series of cleanups and some
API adjustments.

* Drivers now have the ability to specify a dependency on config items. This
allows clients to pin certain objects when removal would have bad
consequences. ocfs2 uses this functionality to codify a mountpoint's
dependency on a live heartbeat.

* A new notification callback, ops->disconnect_notify(), is added. It is
called just before item linkage is broken during removal. This allows
configfs users to do work on the item while it is still in the hierarchy.

Three of the configfs patches which modify the API touch code in fs/dlm (in
fairly trivial ways) so David has been CC'd. Those patches are also attached
to the end of this mail. I had to fix some conflicts with an fs/dlm/config.c
change when getting two of those ready for merge so the commit messages have
been marked as such.


Please pull from the 'upstream-linus' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2.git upstream-linus

to receive the following updates:

Documentation/filesystems/configfs/configfs.txt | 57
Documentation/filesystems/configfs/configfs_example.c | 2
fs/configfs/configfs_internal.h | 7
fs/configfs/dir.c | 289 +
fs/configfs/file.c | 28
fs/configfs/item.c | 29
fs/dlm/config.c | 20
fs/ocfs2/alloc.c | 2676 ++++++++++++++++--
fs/ocfs2/alloc.h | 43
fs/ocfs2/aops.c | 1015 ++++--
fs/ocfs2/aops.h | 61
fs/ocfs2/cluster/heartbeat.c | 96
fs/ocfs2/cluster/heartbeat.h | 6
fs/ocfs2/cluster/nodemanager.c | 42
fs/ocfs2/cluster/nodemanager.h | 5
fs/ocfs2/cluster/tcp.c | 21
fs/ocfs2/dir.c | 2
fs/ocfs2/dlm/dlmdomain.c | 8
fs/ocfs2/dlm/dlmmaster.c | 40
fs/ocfs2/dlm/dlmrecovery.c | 79
fs/ocfs2/dlmglue.c | 6
fs/ocfs2/endian.h | 5
fs/ocfs2/extent_map.c | 41
fs/ocfs2/file.c | 702 ++++
fs/ocfs2/file.h | 10
fs/ocfs2/heartbeat.c | 10
fs/ocfs2/ioctl.c | 15
fs/ocfs2/journal.c | 6
fs/ocfs2/journal.h | 2
fs/ocfs2/mmap.c | 167 -
fs/ocfs2/namei.c | 2
fs/ocfs2/ocfs2.h | 14
fs/ocfs2/ocfs2_fs.h | 33
fs/ocfs2/slot_map.c | 12
fs/ocfs2/suballoc.c | 46
fs/ocfs2/suballoc.h | 17
fs/ocfs2/super.c | 27
fs/ocfs2/super.h | 2
include/linux/configfs.h | 34
39 files changed, 4623 insertions(+), 1054 deletions(-)

Christoph Hellwig:
ocfs2: use list_for_each_entry where benefical

Eric Sandeen:
ocfs2: zero_user_page conversion

Joel Becker:
configfs: consistent attribute size
configfs: Convert subsystem semaphore to mutex
configfs: accessing item hierarchy during rmdir(2)
configfs: config item dependancies.
ocfs2: Depend on configfs heartbeat items.
ocfs2: live heartbeat depends on the local node configuration
ocfs2: Wake up a starting region if it gets killed in the background.

Johannes Berg:
configsfs buffer: use mutex

Mark Fasheh:
ocfs2: take ip_alloc_sem during entire truncate
ocfs2: rework ocfs2_buffered_write_cluster()
ocfs2: factor out write aops into nolock variants
ocfs2: shared writeable mmap
ocfs2: harden buffer check during mapping of page blocks
ocfs2: simplify deallocation locking
ocfs2: plug truncate into cached dealloc routines
ocfs2: use all extent block suballocators
ocfs2: abstract btree growing calls
ocfs2: btree changes for unwritten extents
ocfs2: small cleanup of ocfs2_write_begin_nolock()
ocfs2: support writing of unwritten extents
ocfs2: Support creation of unwritten extents
ocfs2: btree support for removal of arbirtrary extents
ocfs2: update truncate handling of partial clusters
ocfs2: support for removing file regions
ocfs2: Support xfs style space reservation ioctls

Satyam Sharma:
configfs: misc cleanups
configfs+dlm: Separate out __CONFIGFS_ATTR into configfs.h
configfs+dlm: Rename config_group_find_obj and state semantics clearly

Shani Moideen:
[KJ PATCH] Replacing memset(<addr>,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c

Sunil Mushran:
ocfs2: Add "preferred slot" mount option


--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@xxxxxxxxxx



From: Satyam Sharma <ssatyam@xxxxxxxxxxxxxx>

[PATCH] configfs+dlm: Separate out __CONFIGFS_ATTR into configfs.h

fs/dlm/config.c contains a useful generic macro called __CONFIGFS_ATTR
that is similar to sysfs' __ATTR macro that makes defining attributes
easy for any user of configfs. Separate it out into configfs.h so that
other users (forthcoming in dynamic netconsole patchset) can use it too.

Signed-off-by: Satyam Sharma <ssatyam@xxxxxxxxxxxxxx>
Cc: David Teigland <teigland@xxxxxxxxxx>
Signed-off-by: Joel Becker <joel.becker@xxxxxxxxxx>
Signed-off-by: Mark Fasheh <mark.fasheh@xxxxxxxxxx>
---
fs/dlm/config.c | 8 --------
include/linux/configfs.h | 16 ++++++++++++++++
2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index 5069b2c..e47eb42 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -133,14 +133,6 @@ static ssize_t cluster_set(struct cluster *cl, unsigned int *cl_field,
return len;
}

-#define __CONFIGFS_ATTR(_name,_mode,_read,_write) { \
- .attr = { .ca_name = __stringify(_name), \
- .ca_mode = _mode, \
- .ca_owner = THIS_MODULE }, \
- .show = _read, \
- .store = _write, \
-}
-
#define CLUSTER_ATTR(name, check_zero) \
static ssize_t name##_write(struct cluster *cl, const char *buf, size_t len) \
{ \
diff --git a/include/linux/configfs.h b/include/linux/configfs.h
index 3d4a96e..def7c83 100644
--- a/include/linux/configfs.h
+++ b/include/linux/configfs.h
@@ -130,6 +130,22 @@ struct configfs_attribute {
mode_t ca_mode;
};

+/*
+ * Users often need to create attribute structures for their configurable
+ * attributes, containing a configfs_attribute member and function pointers
+ * for the show() and store() operations on that attribute. They can use
+ * this macro (similar to sysfs' __ATTR) to make defining attributes easier.
+ */
+#define __CONFIGFS_ATTR(_name, _mode, _show, _store) \
+{ \
+ .attr = { \
+ .ca_name = __stringify(_name), \
+ .ca_mode = _mode, \
+ .ca_owner = THIS_MODULE, \
+ }, \
+ .show = _show, \
+ .store = _store, \
+}

/*
* If allow_link() exists, the item can symlink(2) out to other
--
1.5.2.1
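
For readers who have not used the macro, the following user-space sketch shows how __CONFIGFS_ATTR is meant to be used. The macro body and struct configfs_attribute follow the patch above; everything else is an assumption made so the snippet compiles outside the kernel: the stand-ins for __stringify and THIS_MODULE, and the hypothetical example_item, example_attribute, and show/store helpers. In-kernel users would rely on the real headers instead.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* User-space stand-ins for kernel definitions (assumptions, not kernel code). */
#define __stringify_1(x)	#x
#define __stringify(x)		__stringify_1(x)
#define THIS_MODULE		NULL
#define EXAMPLE_PAGE_SIZE	4096

struct module;

struct configfs_attribute {
	const char	*ca_name;
	struct module	*ca_owner;
	mode_t		ca_mode;
};

/* Macro body as introduced by the patch. */
#define __CONFIGFS_ATTR(_name, _mode, _show, _store)	\
{							\
	.attr	= {					\
		.ca_name = __stringify(_name),		\
		.ca_mode = _mode,			\
		.ca_owner = THIS_MODULE,		\
	},						\
	.show	= _show,				\
	.store	= _store,				\
}

/* Hypothetical configurable object with a single integer setting. */
struct example_item {
	int threshold;
};

/* Hypothetical per-subsystem attribute wrapper, as the macro comment describes. */
struct example_attribute {
	struct configfs_attribute attr;
	ssize_t (*show)(struct example_item *, char *);
	ssize_t (*store)(struct example_item *, const char *, size_t);
};

static ssize_t example_threshold_show(struct example_item *it, char *page)
{
	return snprintf(page, EXAMPLE_PAGE_SIZE, "%d\n", it->threshold);
}

static ssize_t example_threshold_store(struct example_item *it,
				       const char *page, size_t count)
{
	char *end;
	long val = strtol(page, &end, 10);

	if (end == page)
		return -1;	/* no digits parsed */
	it->threshold = (int)val;
	return (ssize_t)count;
}

/* One line per attribute instead of a hand-written initializer. */
struct example_attribute example_attr_threshold =
	__CONFIGFS_ATTR(threshold, 0644,
			example_threshold_show, example_threshold_store);
```

The macro only fills in an initializer, so it works with any wrapper structure that has members named attr, show and store.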




From: Satyam Sharma <ssatyam@xxxxxxxxxxxxxx>

[PATCH] configfs+dlm: Rename config_group_find_obj and state semantics clearly

Configfs being based upon sysfs code, config_group_find_obj() is probably
so named because of the similar kset_find_obj() in sysfs. However,
"kobject"s in sysfs become "config_item"s in configfs, so let's call it
config_group_find_item() instead, for sake of uniformity, and make
corresponding change in the users of this function.

BTW a crucial difference between kset_find_obj and config_group_find_item
is in locking expectations. kset_find_obj does its locking by itself, but
config_group_find_item expects the *caller* to do the locking. The reason
for this: kset's have their own locks, config_group's don't but instead
rely on the subsystem mutex. And, subsystem needn't necessarily be around
when config_group_find_item() is called.

So let's state these locking semantics explicitly, and rectify the comment,
otherwise bugs could continue to occur in future, as they did in the past
(refer commit d82b8191e238 in gfs2-2.6-fixes.git).

[ I also took the opportunity to fix some bad whitespace and
double-empty lines. --Joel ]

[ Conflict in fs/dlm/config.c with commit
3168b0780d06ace875696f8a648d04d6089654e5 manually resolved. --Mark ]

Signed-off-by: Satyam Sharma <ssatyam@xxxxxxxxxxxxxx>
Cc: David Teigland <teigland@xxxxxxxxxx>
Signed-off-by: Joel Becker <joel.becker@xxxxxxxxxx>
Signed-off-by: Mark Fasheh <mark.fasheh@xxxxxxxxxx>
---
fs/configfs/item.c | 18 ++++++++----------
fs/dlm/config.c | 2 +-
include/linux/configfs.h | 7 ++-----
3 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/fs/configfs/item.c b/fs/configfs/item.c
index b762bbe..76dc4c3 100644
--- a/fs/configfs/item.c
+++ b/fs/configfs/item.c
@@ -183,27 +183,25 @@ void config_group_init(struct config_group *group)
INIT_LIST_HEAD(&group->cg_children);
}

-
/**
- * config_group_find_obj - search for item in group.
+ * config_group_find_item - search for item in group.
* @group: group we're looking in.
* @name: item's name.
*
- * Lock group via @group->cg_subsys, and iterate over @group->cg_list,
- * looking for a matching config_item. If matching item is found
- * take a reference and return the item.
+ * Iterate over @group->cg_list, looking for a matching config_item.
+ * If matching item is found take a reference and return the item.
+ * Caller must have locked group via @group->cg_subsys->su_mtx.
*/
-struct config_item *config_group_find_obj(struct config_group *group,
- const char * name)
+struct config_item *config_group_find_item(struct config_group *group,
+ const char *name)
{
struct list_head * entry;
struct config_item * ret = NULL;

- /* XXX LOCKING! */
list_for_each(entry,&group->cg_children) {
struct config_item * item = to_item(entry);
if (config_item_name(item) &&
- !strcmp(config_item_name(item), name)) {
+ !strcmp(config_item_name(item), name)) {
ret = config_item_get(item);
break;
}
@@ -215,4 +213,4 @@ EXPORT_SYMBOL(config_item_init);
EXPORT_SYMBOL(config_group_init);
EXPORT_SYMBOL(config_item_get);
EXPORT_SYMBOL(config_item_put);
-EXPORT_SYMBOL(config_group_find_obj);
+EXPORT_SYMBOL(config_group_find_item);
diff --git a/fs/dlm/config.c b/fs/dlm/config.c
index e47eb42..4348cb4 100644
--- a/fs/dlm/config.c
+++ b/fs/dlm/config.c
@@ -752,7 +752,7 @@ static struct space *get_space(char *name)
return NULL;

down(&space_list->cg_subsys->su_sem);
- i = config_group_find_obj(space_list, name);
+ i = config_group_find_item(space_list, name);
up(&space_list->cg_subsys->su_sem);

return to_space(i);
diff --git a/include/linux/configfs.h b/include/linux/configfs.h
index def7c83..bbb1b6c 100644
--- a/include/linux/configfs.h
+++ b/include/linux/configfs.h
@@ -86,12 +86,10 @@ struct config_item_type {
struct configfs_attribute **ct_attrs;
};

-
/**
* group - a group of config_items of a specific type, belonging
* to a specific subsystem.
*/
-
struct config_group {
struct config_item cg_item;
struct list_head cg_children;
@@ -99,13 +97,11 @@ struct config_group {
struct config_group **default_groups;
};

-
extern void config_group_init(struct config_group *group);
extern void config_group_init_type_name(struct config_group *group,
const char *name,
struct config_item_type *type);

-
static inline struct config_group *to_config_group(struct config_item *item)
{
return item ? container_of(item,struct config_group,cg_item) : NULL;
@@ -121,7 +117,8 @@ static inline void config_group_put(struct config_group *group)
config_item_put(&group->cg_item);
}

-extern struct config_item *config_group_find_obj(struct config_group *, const char *);
+extern struct config_item *config_group_find_item(struct config_group *,
+ const char *);


struct configfs_attribute {
--
1.5.2.1
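
The locking contract spelled out in the commit message above (the lookup itself takes no lock; the caller must hold the subsystem lock around it) can be modeled in user space as follows. This is only a sketch of the pattern: all example_ names are hypothetical, a pthread mutex stands in for the subsystem lock, and a plain counter stands in for the config_item reference count.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a config_item. */
struct example_item {
	const char		*name;
	int			refcount;
	struct example_item	*next;
};

/* Hypothetical stand-in for a config_group: it has no lock of its own. */
struct example_group {
	pthread_mutex_t		*subsys_lock;	/* owned by the subsystem */
	struct example_item	*children;
};

/*
 * Search for an item by name and take a reference on it.
 * Caller must hold *group->subsys_lock, as with config_group_find_item().
 */
struct example_item *example_group_find_item(struct example_group *group,
					     const char *name)
{
	struct example_item *it;

	for (it = group->children; it; it = it->next) {
		if (it->name && !strcmp(it->name, name)) {
			it->refcount++;
			return it;
		}
	}
	return NULL;
}

/* Caller-side pattern, shaped like get_space() in the fs/dlm hunk. */
struct example_item *example_lookup(struct example_group *group,
				    const char *name)
{
	struct example_item *it;

	pthread_mutex_lock(group->subsys_lock);
	it = example_group_find_item(group, name);
	pthread_mutex_unlock(group->subsys_lock);
	return it;
}
```

Because the reference is taken while the lock is held, the returned item stays valid after the unlock even if it is removed from the group in the meantime.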