Re: [PATCH v10 4/8] block: Add bio_reset()
From: Jens Axboe
Date: Fri Sep 07 2012 - 18:44:07 EST
On 2012-09-07 16:25, Kent Overstreet wrote:
> On Fri, Sep 07, 2012 at 04:06:45PM -0600, Jens Axboe wrote:
>> On 2012-09-07 15:55, Jens Axboe wrote:
>>> On 2012-09-07 14:58, Kent Overstreet wrote:
>>>> On Thu, Sep 06, 2012 at 07:34:18PM -0600, Jens Axboe wrote:
>>>>> On 2012-09-06 16:34, Kent Overstreet wrote:
>>>>>> Reusing bios is something that's been highly frowned upon in the past,
>>>>>> but driver code keeps doing it anyways. If it's going to happen anyways,
>>>>>> we should provide a generic method.
>>>>>>
>>>>>> This'll help with getting rid of bi_destructor - drivers/block/pktcdvd.c
>>>>>> was open coding it, by doing a bio_init() and resetting bi_destructor.
>>>>>>
>>>>>> This required reordering struct bio, but the block layer is not yet
>>>>>> nearly fast enough for any cacheline effects to matter here.
>>>>>
>>>>> That's an odd and misplaced comment. Was just doing testing today at 5M
>>>>> IOPS, and even years back we've had cache effects for O_DIRECT in higher
>>>>> speed setups.
>>>>
>>>> Ah, I wasn't aware that you were pushing that many iops through the
>>>> block layer - most I've tested myself was around 1M. It wouldn't
>>>> surprise me if cache effects in struct bio mattered around 5M...
>>>
>>> 5M is nothing, just did 13.5M :-)
>>>
>>> But we can reshuffle for now. As mentioned, we're way overdue for a
>>> decent look at cache profiling in any case.
>>
>> No ill effects seen so far, fwiw:
>>
>> read : io=1735.8GB, bw=53690MB/s, iops=13745K, runt= 33104msec
>
> Cool!
>
> I'd be really curious to see a profile. Of the patches I've got queued
> up I don't think anything's going to significantly affect performance
> yet, but I'm hoping the cleanups/immutable bvec stuff/efficient bio
> splitting enables some performance gains.

Got more work to do, but certainly not a problem sharing.

> Well, it certainly will for stacking drivers, but I'm less sure what
> it's going to look like running on just a raw flash device.
>
> My end goal is making generic_make_request handle arbitrary sized bios,
> and have (efficient) splitting happen as required. This'll get rid of a
> bunch of code and complexity in the upper layers, in bio_add_page() and
> elsewhere. More in the stacking drivers - merge_bvec_fn is horrendous to
> support.

It is a nasty interface, in retrospect probably a mistake. As long as we
don't split ever on non-stacking drivers, I don't care too much. And it
would get rid of complexity in those drivers, so that's a nice win.
merge_bvec_fn is not only a bad interface, it's also pretty slow...

> I think I might be able to efficiently get rid of the
> segments-after-merging precalculating, and just have segments merged
> once. That'd get rid of a couple fields in struct bio, and get it under
> 2 cachelines last I counted.

It's 2 cachelines now, but reducing is always a great thing. Getting rid
of the repeated recalculate after merge would be a nice win.

> Course, all this doesn't matter as much for 4k bios so it may just be a
> wash for you.

Right, for me it doesn't matter. As long as you don't slow me down :-)

--
Jens Axboe
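
For readers following the thread: the patch description says bio_reset()
required reordering struct bio. One common way to build such a reset is to
group the per-I/O fields at the front of the struct and clear them with a
single memset up to a marker field, leaving the fields behind the marker
untouched. The patch itself is not quoted in this message, so the code below
is only a userspace sketch of that general technique under that assumption.
struct fake_bio, its fields, FAKE_BIO_RESET_BYTES and fake_bio_reset() are
hypothetical names for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct bio; this is NOT the kernel's layout. */
struct fake_bio {
	/* Per-I/O state: cleared on every reset. */
	unsigned long sector;
	unsigned int size;
	unsigned int idx;
	void *private_data;

	/* Marker: this field and everything after it survives a reset. */
	unsigned int max_vecs;
	void *io_vec;
	void *pool;
};

/* Number of bytes to clear: everything before the first persistent field. */
#define FAKE_BIO_RESET_BYTES offsetof(struct fake_bio, max_vecs)

/* Return the object to a freshly initialized state without reallocating it. */
static void fake_bio_reset(struct fake_bio *bio)
{
	assert(bio != NULL);
	memset(bio, 0, FAKE_BIO_RESET_BYTES);
}
```

After fake_bio_reset(), sector, size, idx and private_data read as zero,
while max_vecs, io_vec and pool keep their values. The layout is the cost of
this approach: any field that must persist has to sit at or after the
marker, which is why a reset of this kind can force a struct to be
reordered, and why the cacheline placement of hot fields comes up in the
discussion above.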