Re: [PATCH 7/8] s390: new DCSS SHM device driver

From: Arnd Bergmann
Date: Thu Dec 30 2004 - 09:33:50 EST


On Dinsdag 28 Dezember 2004 09:28, Heiko Carstens wrote:
> [PATCH 7/8] s390: dcss shared memory.
>
> From: Carsten Otte <cotte@xxxxxxxxxx>
>
> Add support for shared memory using z/VM DCSS.

I'd rather not see this driver merged at this point. It's completely
lacking support for class devices, which means it will not work
with udev. Also, I'm still feeling the interface should much more
resemble Posix shared memory rather than 'use sysfs to create a
character device per shared memory segment'.

Ideally, we should have some user interface that also makes sense
for cross-guest shared memory on UML/Xen/etc.

Regarding the use of sysfs, I also think the memory segment
'devices' should share the name space with the ones used in dcssblk,
since they are actually the same resources, just used by different
device drivers. So instead of having

/sys/block/dcssblk/dcssblk0/device -> ../../../devices/dcssblk/foo
/sys/devices/dcssblk/foo/
/sys/devices/dcssshm/bar/

it rather should be

/sys/block/dcssblk/dcssblk0/device -> ../../../devices/dcss/foo
/sys/class/dcssshm/yourseg/device -> ../../../devices/dcss/bar
/sys/devices/dcss/foo/
/sys/devices/dcss/bar/

I'd really like to hear an outside opinion on this.

Since Carsten is currently on vacation, he won't be able to reply
too soon, so I suggest we postpone this for now.

Carsten, sorry I couldn't look over this before Heiko sent it out,
the next evening out is on me.

There are also a few smaller nits I'd like to pick:

> +#include <asm/ccwdev.h> // for s390_root_dev_(un)register()

This seems to be getting out of the original scope. I don't know
if anything better exists already, but I don't think registering
bus devices at the /sys/devices should depend on architecture
specific code. Either we agree that such a function is needed
for all architectures, or we should stop using it.

> +#define DCSSSHM_DEBUG /* Debug messages on/off */
> +#define DCSSSHM_NAME "dcssshm"
> +#ifdef DCSSSHM_DEBUG
> +#define PRINT_DEBUG(x...) printk(KERN_DEBUG DCSSSHM_NAME " debug: " x)
> +#else
> +#define PRINT_DEBUG(x...) do {} while (0)
> +#endif
> +#define PRINT_INFO(x...) printk(KERN_INFO DCSSSHM_NAME " info: " x)
> +#define PRINT_WARN(x...) printk(KERN_WARNING DCSSSHM_NAME " warning: " x)
> +#define PRINT_ERR(x...) printk(KERN_ERR DCSSSHM_NAME " error: " x)

IMHO, it would be better to use pr_info/pr_debug for new code.

> +
> +static ssize_t dcssshm_add_store(struct device * dev, const char * buf,
> + size_t count);
> +static ssize_t dcssshm_remove_store(struct device * dev, const char * buf,
> + size_t count);
> +static ssize_t dcssshm_save_store(struct device * dev, const char * buf,
> + size_t count);
> +static ssize_t dcssshm_save_show(struct device *dev, char *buf);
> +static ssize_t dcssshm_shared_store(struct device * dev, const char * buf,
> + size_t count);
> +static ssize_t dcssshm_shared_show(struct device *dev, char *buf);
> +
> +static int dcssshm_open(struct inode *inode, struct file *filp);
> +static int dcssshm_release(struct inode *inode, struct file *filp);
> +static int dcssshm_mmap(struct file * file, struct vm_area_struct * vma);
> +static struct page * dcssshm_nopage_in_place(struct vm_area_struct * area,
> + unsigned long address, int* type);
> +static loff_t dcssshm_llseek (struct file* file, loff_t offset, int orig);

If you move all functions into call graph order, you don't need any forward
declarations and at the same time it becomes obvious that there are no
direct recursions.

> +static DEVICE_ATTR(add, S_IWUSR, NULL, dcssshm_add_store);
> +static DEVICE_ATTR(remove, S_IWUSR, NULL, dcssshm_remove_store);

Maybe it's just me, but these edge-triggered event attributes in sysfs
feel wrong.

> +struct dcssshm_dev_info {
> + struct list_head lh;
> + struct device dev;
> + struct cdev *cdev;
> + char segment_name[BUS_ID_SIZE];
> + atomic_t use_count;
> + unsigned long start;
> + unsigned long end;
> + int segment_type;
> + unsigned char save_pending;
> + unsigned char is_shared;
> + unsigned char is_ro;
> + int minor;
> +};
It may be better to have two structures here, on that embeds the device
and one that embeds the cdev and class_device.

> +static struct vm_operations_struct dcssshm_vm_ops = {
> + .nopage = dcssshm_nopage_in_place,
> +};
> +
> +static struct file_operations dcssshm_fops =
> +{
Slightly inconsistent indentation here.

> +static struct list_head dcssshm_devices = LIST_HEAD_INIT(dcssshm_devices);
> +static struct rw_semaphore dcssshm_devices_sem;
static LIST_HEAD(dcssshm_devices);
static DECLARE_MUTEX(dcssshm_devices_sem);

> +/*
> + * release function for segment device.
> + */
> +static void
> +dcssshm_release_segment(struct device *dev)
> +{
> + PRINT_DEBUG("segment release fn called for %s\n", dev->bus_id);
> + kfree(container_of(dev, struct dcssshm_dev_info, dev));
> + module_put(THIS_MODULE);
> +}
AFAICS, there is still a tiny race against module unload here:
module_put(THIS_MODULE) is practically always a bug!

> +
> +/*
> + * get a minor number. needs to be called with
> + * down_write(&dcssshm_devices_sem) and the
> + * device needs to be enqueued before the semaphore is
> + * freed.
> + */
> +static inline int
> +dcssshm_assign_free_minor(struct dcssshm_dev_info *dev_info)
> +{
> + int minor, found;

This can probably be done much simpler using idr.

> + local_buf = kmalloc(count + 1, GFP_KERNEL);
> + if (local_buf == NULL) {
> + rc = -ENOMEM;
> + goto out_nobuf;
> + }

Why use kmalloc for this, when the buffer must not exceed 9 bytes?


> + if (imajor(filp->f_dentry->d_inode) != dcssshm_major)
> + return -ENODEV;

huh?

> +
> + minor = iminor(filp->f_dentry->d_inode);

iminor(inode) ?

> + down_write(&dcssshm_devices_sem);
> + dev_info = dcssshm_get_device_by_minor(minor);
> + if (!dev_info) {
> + rc = -ENODEV;
> + goto up_read;
> + }
> +
> +
> + filp->private_data = dev_info;
> + atomic_inc(&dev_info->use_count);

Why the extra use count?

> +
> + rc = 0;
> +up_read:
> + up_write(&dcssshm_devices_sem);

Interesting label name 8-)

> +static loff_t
> +dcssshm_llseek (struct file* file, loff_t offset, int orig)
> +{
> + loff_t ret;
> + struct dcssshm_dev_info *dev_info = (struct dcssshm_dev_info*)
> + file->private_data;

Is seek actually useful if you don't have read/write?

> + rc = device_create_file(dcssshm_root_dev, &dev_attr_remove);
> + if (rc) {
> + PRINT_ERR("device_create_file(remove) failed!\n");
> + s390_root_dev_unregister(dcssshm_root_dev);
> + return rc;
> + }
> + rc = alloc_chrdev_region (&dev, 0, 256, "dcssshm");
> + if (rc) {
> + PRINT_ERR("alloc_chrdev_region falied!\n");
> + s390_root_dev_unregister(dcssshm_root_dev);
> + return rc;
> + }

Why the limit to 256 segments? The rest of the driver appears
to be written to avoid such limitations. Also, using goto for
error handling, like in the other functions, would make this more
readable.

Arnd <><


Attachment: pgp00000.pgp
Description: signature