RE: [Patch v4 2/3] Drivers: hv: add Azure Blob driver

From: Long Li
Date: Tue Jul 20 2021 - 15:59:19 EST


> Subject: Re: [Patch v4 2/3] Drivers: hv: add Azure Blob driver
>
> On Mon, Jul 19, 2021 at 08:31:05PM -0700, longli@xxxxxxxxxxxxxxxxx wrote:
> > +struct az_blob_device {
> > + struct hv_device *device;
> > +
> > + /* Opened files maintained by this device */
> > + struct list_head file_list;
> > + /* Lock for protecting file_list */
> > + spinlock_t file_lock;
> > +
> > + /* The refcount for this device */
> > + refcount_t count;
>
> Just use a kref please if you really need this. Are you sure you do?
> You already have 2 other reference counted objects being used here, why make
> it 3?

The "count" is to keep track how many user-mode instances and vmbus instance
are opened on this device. Being a VMBUS device, this device can be removed
at any time (host servicing etc). We must remove the device when this happens
even if the device is still opened by some user-mode program. The "count" will
guarantee the lifecycle of the device object after all user-mode has released the device.

I looked at using "file_list" (it's used for tracking opened files by user-mode) for this purpose,
but I found out I still need to manage the device count at the vmbus side.

>
> > + /* Pending requests to VSP */
> > + atomic_t pending;
>
> Why does this need to be atomic?

"pending' is per-device maintained that could change when multiple-user access
the device at the same time.

>
>
> > + wait_queue_head_t waiting_to_drain;
> > +
> > + bool removing;
>
> Are you sure this actually works properly? Why is it needed vs. any other misc
> device?

When removing this device from vmbus, we need to guarantee there is no possible packets to
vmbus. This is a requirement before calling vmbus_close(). Other drivers of vmbus follows
the same procedure.

The reason why this driver needs this is that the device removal can happen in the middle of
az_blob_ioctl_user_request(), which can send packet over vmbus.

>
>
> > +/* VSC->VSP request */
> > +struct az_blob_vsp_request {
> > + u32 version;
> > + u32 timeout_ms;
> > + u32 data_buffer_offset;
> > + u32 data_buffer_length;
> > + u32 data_buffer_valid;
> > + u32 operation_type;
> > + u32 request_buffer_offset;
> > + u32 request_buffer_length;
> > + u32 response_buffer_offset;
> > + u32 response_buffer_length;
> > + guid_t transaction_id;
> > +} __packed;
>
> Why packed? If this is going across the wire somewhere, you need to specify
> the endian-ness of these values, right? If this is not going across the wire, no
> need for it to be packed.

Those data go through the wire.

All data structures specified in the Hyper-V and guest VM use Little Endian byte
ordering. All HV core drivers have a dependence on X86, that guarantees this
ordering.

>
> > +
> > +/* VSP->VSC response */
> > +struct az_blob_vsp_response {
> > + u32 length;
> > + u32 error;
> > + u32 response_len;
> > +} __packed;
>
> Same here.
>
> > +
> > +struct az_blob_vsp_request_ctx {
> > + struct list_head list;
> > + struct completion wait_vsp;
> > + struct az_blob_request_sync *request; };
> > +
> > +struct az_blob_file_ctx {
> > + struct list_head list;
> > +
> > + /* List of pending requests to VSP */
> > + struct list_head vsp_pending_requests;
> > + /* Lock for protecting vsp_pending_requests */
> > + spinlock_t vsp_pending_lock;
> > + wait_queue_head_t wait_vsp_pending;
> > +
> > + pid_t pid;
>
> Why do you need a pid? What namespace is this pid in?

It's a request from user library team for production troubleshooting
purposes. It's exposed as informal in debugfs.

>
> > +static int az_blob_probe(struct hv_device *device,
> > + const struct hv_vmbus_device_id *dev_id) {
> > + int ret;
> > + struct az_blob_device *dev;
> > +
> > + dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> > + if (!dev)
> > + return -ENOMEM;
> > +
> > + spin_lock_init(&dev->file_lock);
> > + INIT_LIST_HEAD(&dev->file_list);
> > + atomic_set(&dev->pending, 0);
> > + init_waitqueue_head(&dev->waiting_to_drain);
> > +
> > + ret = az_blob_connect_to_vsp(device, dev, AZ_BLOB_RING_SIZE);
> > + if (ret)
> > + goto fail;
> > +
> > + refcount_set(&dev->count, 1);
> > + az_blob_dev = dev;
> > +
> > + // create user-mode client library facing device
> > + ret = az_blob_create_device(dev);
> > + if (ret) {
> > + dev_err(AZ_DEV, "failed to create device ret=%d\n", ret);
> > + az_blob_remove_vmbus(device);
> > + goto fail;
> > + }
> > +
> > + dev_info(AZ_DEV, "successfully probed device\n");
>
> When drivers are working properly, they should be quiet.

The reason is that in production environment when dealing with custom support
cases, there is no good way to check if the channel is opened on the device. Having
this message will greatly clear confusions on possible mis-configurations.

>
> And what is with the AZ_DEV macro mess?

It's not required, it's just for saving code length. I can put "&az_blob_dev->device->device"
in every dev_err(), but it makes the code look a lot longer.

>
> And can you handle more than one device in the system at one time? I think
> your debugfs logic will get really confused.

There can be one device object active in the system at any given time. The debugfs grabs
the current active device object. If the device is being removed, removed or added,
the current active device object is updated accordingly.

>
> thanks,
>
> greg k-h