Re: [RFC 0/2] target: Add TFO->complete_irq queue_work bypass

From: Sagi Grimberg
Date: Thu Jun 04 2015 - 13:01:14 EST


On 6/4/2015 10:06 AM, Nicholas A. Bellinger wrote:
> On Wed, 2015-06-03 at 14:57 +0200, Christoph Hellwig wrote:
>> This makes lockdep very unhappy, rightly so. If you execute
>> one end_io function inside another you basically nest every possible
>> lock taken in the I/O completion path. Also adding more work
>> to the hardirq path generally isn't a smart idea. Can you explain
>> what issues you were seeing and how much this helps? Note that
>> the workqueue usage in the target core so far is fairly basic, so
>> there should be some low-hanging fruit.

> So I've been using tcm_loop + RAMDISK backends for prototyping, but this
> patch is intended for vhost-scsi, so that it can avoid the unnecessary
> queue_work() context switch within target_complete_cmd() for all backend
> driver types.
>
> This is because vhost_work_queue() just adds the work to
> vhost_dev->work_list and immediately calls wake_up_process() to hand it
> off to a separate vhost_worker() process context. For heavy small-block
> workloads into fast IBLOCK backends, avoiding this extra context switch
> should be a nice efficiency win.

I can see that; did you get a chance to measure the expected latency
improvement?
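
For reference, the deferral being bypassed looks roughly like the sketch
below -- simplified from how I read target_complete_cmd(), with a
hypothetical se_tfo->complete_irq() hook named after the RFC subject.
Its signature and return convention are my guess, not the actual patch:

/*
 * Simplified sketch, not the RFC patch itself: let a fabric driver opt
 * in to completing a command directly from the backend driver's
 * (hard)IRQ context instead of bouncing through the completion
 * workqueue.  The complete_irq() member and its return convention are
 * assumptions on my part.
 */
void target_complete_cmd(struct se_cmd *cmd, u8 scsi_status)
{
	cmd->scsi_status = scsi_status;

	/* Hypothetical fast path: fabric completes in IRQ context. */
	if (cmd->se_tfo->complete_irq &&
	    cmd->se_tfo->complete_irq(cmd))
		return;

	/*
	 * Existing slow path: defer to process context.  For vhost-scsi
	 * this means queue_work() here, and later vhost_work_queue()
	 * adding to vhost_dev->work_list and waking the vhost_worker()
	 * thread -- two handoffs per completion.
	 */
	INIT_WORK(&cmd->work, target_complete_ok_work);
	queue_work(target_completion_wq, &cmd->work);
}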


> Also, AFAIK RDMA fabrics are allowed to invoke ib_post_send() for
> response callbacks directly from IRQ context as well.

This is correct in general; ib_post_send() is not allowed to schedule.
isert/srpt might see a latency benefit here, but it would require the
drivers to pre-allocate the SGLs (ib_sge's) and use a worst-case
approach (or use GFP_ATOMIC allocations - I'm not sure which is
better...)
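
Something along these lines is what I have in mind for the
pre-allocation variant. It is only a rough sketch with made-up names
(fabric_cmd_ctx, fabric_queue_response_irq, FABRIC_MAX_SGE), not actual
isert/srpt code:

#include <rdma/ib_verbs.h>

#define FABRIC_MAX_SGE	16	/* made-up worst case */

/* Hypothetical per-command context, reserved at command setup time. */
struct fabric_cmd_ctx {
	struct ib_send_wr tx_wr;
	struct ib_sge sge[FABRIC_MAX_SGE];
};

/*
 * Post the response from the completion path without allocating or
 * sleeping, so it stays safe from hardirq context.  The alternative
 * would be a GFP_ATOMIC allocation of the ib_sge array right here.
 */
static int fabric_queue_response_irq(struct ib_qp *qp,
				     struct fabric_cmd_ctx *ctx,
				     u64 addr, u32 length, u32 lkey)
{
	struct ib_send_wr *bad_wr;

	ctx->sge[0].addr = addr;
	ctx->sge[0].length = length;
	ctx->sge[0].lkey = lkey;

	ctx->tx_wr.opcode = IB_WR_SEND;
	ctx->tx_wr.sg_list = ctx->sge;
	ctx->tx_wr.num_sge = 1;
	ctx->tx_wr.send_flags = IB_SEND_SIGNALED;

	return ib_post_send(qp, &ctx->tx_wr, &bad_wr);
}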