Re: [PATCH 1/3] vhost: replace vhost_workqueue with per-vhostkthread

From: Oleg Nesterov
Date: Mon May 31 2010 - 10:46:44 EST


On 05/30, Tejun Heo wrote:
>
> This conversion is to make each vhost use a dedicated kthread so that
> resource control via cgroup can be applied.

Personally, I agree. I think This is better than play with workqueue thread.

A couple of simple questions after the quick glance at the unapplied patch...

> void vhost_poll_flush(struct vhost_poll *poll)
> {
> - flush_work(&poll->work);
> + int seq = poll->queue_seq;
> +
> + if (seq - poll->done_seq > 0)
> + wait_event(poll->done, seq - poll->done_seq <= 0);

The check before wait_event() is not needed, please note that wait_event()
checks the condition before __wait_event().

What I can't understand is why we do have ->queue_seq and ->done_seq.

Isn't the single "bool poll->active" enough? vhost_poll_queue() sets
->active == T, vhost_poller() clears it before wake_up_all(poll->done).

> +static int vhost_poller(void *data)
> +{
> + struct vhost_dev *dev = data;
> + struct vhost_poll *poll;
> +
> +repeat:
> + set_current_state(TASK_INTERRUPTIBLE); /* mb paired w/ kthread_stop */

I don't understand the comment... why do we need this barrier?

> + if (kthread_should_stop()) {
> + __set_current_state(TASK_RUNNING);
> + return 0;
> + }
> +
> + poll = NULL;
> + spin_lock(&dev->poller_lock);
> + if (!list_empty(&dev->poll_list)) {
> + poll = list_first_entry(&dev->poll_list,
> + struct vhost_poll, node);
> + list_del_init(&poll->node);
> + }
> + spin_unlock(&dev->poller_lock);
> +
> + if (poll) {
> + __set_current_state(TASK_RUNNING);
> + poll->fn(poll);
> + smp_wmb(); /* paired with rmb in vhost_poll_flush() */
> + poll->done_seq = poll->queue_seq;
> + wake_up_all(&poll->done);
> + } else
> + schedule();
> +
> + goto repeat;
> +}

Given that vhost_poll_queue() does list_add() and wake_up_process() under
->poller_lock, I don't think we need any barriers to change ->state.

IOW, can't vhost_poller() simply do

while(!kthread_should_stop()) {

poll = NULL;
spin_lock(&dev->poller_lock);
if (!list_empty(&dev->poll_list)) {
...
} else
__set_current_state(TASK_INTERRUPTIBLE);
spin_unlock(&dev->poller_lock);

if (poll) {
...
} else
schedule();
}

?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/