Re: [PATCH 00/26] AIO performance improvements/cleanups, v2
From: Jens Axboe
Date: Sat Dec 15 2012 - 07:59:48 EST
On 2012-12-15 11:36, Kent Overstreet wrote:
> On Sat, Dec 15, 2012 at 10:46:32AM +0100, Jens Axboe wrote:
>> On 2012-12-15 10:25, Kent Overstreet wrote:
>>> Cool, thanks for the numbers!
>>>
>>> I suspect the difference is due to contention on the ringbuffer,
>>> completion side. You didn't enable my batched completion stuff, did you?
>>
>> No, haven't tried the batching yet.
>>
>>> I suspect the numbers would look quite a bit different with that,
>>> based on my own profiling. If the driver for the device you're testing
>>> on is open source, I'd be happy to do the conversion (it's a 5 minute
>>> job).
>>
>> Knock yourself out - I already took a quick look at it, and conversion
>> should be pretty simple. It's the mtip32xx driver, it's in the kernel. I
>> would suggest getting rid of the ->async_callback() (since it's always
>> bio_endio()) since that'll make it cleaner.
>
> Just pushed my conversion - it's untested, but it's pretty
> straightforward.
Let me give it a whirl. It'll likely improve the situation, since we are
CPU limited at this point. I will definitely add the batching on my side
too, it's a clear win for the (usual) case of reaping multiple
completion events per IRQ.
>>> It's been a while since I looked at your patch, but you're getting
>>> rid of the aio ringbuffer and using a linked list instead, right? My
>>> batched completion stuff should still benefit that case.
>>
>> Yes, I make the ring interface optional. Basically you tell aio to use
>> the ring or not at io_queue_init() time. If you don't care about the
>> ring, we can use a lockless list for the completions.
>
> Yeah, it is a good idea - I'm certainly not attached to the current
> ringbuffer implementation (though a ringbuffer isn't a terrible idea if
> we had one that was implemented correctly).
I agree, a ringbuffer isn't necessarily a bad interface. But the fact is
that:
1) Current ringbuffer is broken
2) And nobody uses it
So as an interface, I think it's dead.
>> You completely remove the cancel, I just make it optional for the gadget
>> case. I'm fine with either of them, though I did not look at your usb
>> change in detail. If it's clean, I suspect we should just kill cancel
>> completely as you did.
>
> We (Zach and I) actually made it optional too, more or less - I haven't
> looked at how you did it yet, but in my tree the linked list is there,
> but the kiocb isn't added to the kioctx's list until something sets a
> cancel function.
Sounds like the same approach I took - list is still there, and whether
the node is empty or not signifies whether we need to lock and remove
the entry on completion.
>>> Though - hrm, I'd have expected getting rid of the cancellation linked
>>> list to make a bigger difference and both our patchsets do that.
>>
>> The machine in question runs out of oomph, which is hampering the
>> results. I should have it beefed up next week. It's running E5-2630
>> right now, will move to E5-2690. I think that should make the results
>> clearer.
>
> Well, the fact that it's cpu bound just means the throughput numbers for
> our various patches are different... and what I'm really interested in
> is the profiles, I can't think of any reason cpu speed would affect that
> much.
Sure, that's a given, since they have the same horse power available and
the setup is identical (same process pinning, irq pinning, etc).
--
Jens Axboe