Re: [PATCH 2/9] IB: add a proper completion queue abstraction

From: Bart Van Assche
Date: Wed Nov 18 2015 - 13:20:27 EST

On 11/17/2015 11:55 PM, Sagi Grimberg wrote:
+static void ib_cq_poll_work(struct work_struct *work)
+{
+	struct ib_cq *cq = container_of(work, struct ib_cq, work);
+	int completed;
+
+	completed = __ib_process_cq(cq, IB_POLL_BUDGET_WORKQUEUE);
+	if (completed >= IB_POLL_BUDGET_WORKQUEUE ||
+	    ib_req_notify_cq(cq, IB_POLL_FLAGS) > 0)
+		queue_work(ib_comp_wq, &cq->work);
+}
+
+static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
+{
+	queue_work(ib_comp_wq, &cq->work);
+}

The above code will cause all polling to occur in the context of the CPU
that received the completion interrupt. This approach is not flexible
enough. For certain workloads throughput is higher if work completions
are processed by another CPU core on the same CPU socket. Has it been
considered to make the CPU core on which work completions are processed
configurable?

The workqueue is unbound. This means that the functionality you are
asking for exists.

Hello Sagi,

Are you perhaps referring to the sysfs CPU mask that allows controlling workqueue affinity? I expect that setting the CPU mask for an entire pool through sysfs will lead to suboptimal results. What I have learned by tuning target systems is that there is a significant performance difference (> 30% IOPS) between a configuration where each completion thread is pinned to exactly one CPU and a configuration where the scheduler is allowed to choose the CPU.
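
To make "pinned to exactly one CPU" concrete, here is a rough user-space sketch. It is only an analogy and not the kernel workqueue code under review: it uses the glibc sched_setaffinity() wrapper on the calling thread, and the helper names (first_allowed_cpu, allowed_cpu_count, pin_self_to_cpu) are made up for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/* Lowest-numbered CPU the calling thread may run on, or -1 on error. */
int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/* Number of CPUs the calling thread may run on, or -1 on error. */
int allowed_cpu_count(void)
{
	cpu_set_t set;

	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	return CPU_COUNT(&set);
}

/*
 * Pin the calling thread to exactly one CPU.
 * Returns 0 on success or an errno value on failure.
 */
int pin_self_to_cpu(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return EINVAL;
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	/* pid 0 means "the calling thread". */
	return sched_setaffinity(0, sizeof(set), &set) == 0 ? 0 : errno;
}
```

A thread would call pin_self_to_cpu(first_allowed_cpu()) once at startup; afterwards allowed_cpu_count() reports 1 and the scheduler no longer has a choice of CPU for that thread.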

Controlling the CPU affinity of worker threads with the taskset command is not possible since the function create_worker() in kernel/workqueue.c calls kthread_bind_mask(). That function sets PF_NO_SETAFFINITY. From sched.h:

#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_allowed */
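
One way to see this from user space is to look at the task flags, which proc(5) documents as field 9 of /proc/<pid>/stat. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the flag value is the one quoted above, which may differ in other kernel versions.

```c
#include <stdio.h>
#include <string.h>

/* Value quoted above from sched.h; assumed to match the running kernel. */
#define PF_NO_SETAFFINITY 0x04000000

/*
 * Extract the task flags (field 9) from one /proc/<pid>/stat line.
 * The comm field may contain spaces and parentheses, so parsing
 * starts after the last ')'. Returns 0 on success, -1 on failure.
 */
int parse_stat_flags(const char *line, unsigned long *flags)
{
	const char *p = strrchr(line, ')');

	if (!p)
		return -1;
	/* Remaining fields: state ppid pgrp session tty_nr tpgid flags */
	if (sscanf(p + 1, " %*c %*d %*d %*d %*d %*d %lu", flags) != 1)
		return -1;
	return 0;
}

/* Nonzero if the flags say user space may not change the CPU affinity. */
int task_has_no_setaffinity(unsigned long flags)
{
	return (flags & PF_NO_SETAFFINITY) != 0;
}

/* Read the flags of task <pid> from procfs. Returns 0 on success. */
int read_task_flags(int pid, unsigned long *flags)
{
	char path[64], line[1024];
	FILE *f;
	int ret = -1;

	snprintf(path, sizeof(path), "/proc/%d/stat", pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = parse_stat_flags(line, flags);
	fclose(f);
	return ret;
}
```

Calling read_task_flags() with the PID of a kworker thread and passing the result to task_has_no_setaffinity() shows whether the flag is set for that thread.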
