Re: [PATCH 1/2] Don't merge different partition's IOs
From: Jens Axboe
Date: Wed Dec 08 2010 - 02:34:16 EST
On 2010-12-07 15:18, Satoru Takeuchi wrote:
> Hi Linus, Yasuaki, and Jens
>
> (2010/12/07 1:08), Linus Torvalds wrote:
>> 2010/12/6 Yasuaki Ishimatsu<isimatu.yasuaki@xxxxxxxxxxxxxx>:
>>>
>>> The problem is caused by merging different partitions' I/Os. So the patch
>>> checks whether the bio or request being merged is on the same partition as
>>> the request, by using the partition's start sector and size.
>>
>> I really think this is wrong.
>>
>> We should just carry the partition information around in the req and
>> the bio, and just compare the pointers, rather than compare the range.
>> No need to even dereference the pointers, you should be able to just
>> do
>>
>>     /* don't merge if not on the same partition */
>>     if (bio->part != req->part)
>>             return 0;
>>
>> or something.
>>
>> This is doubly true since the accounting already does that horrible
>> partition lookup: rather than look it up, we should just _set_ it in
>> __generic_make_request(), where I think we already know it since we do
>> that whole blk_partition_remap().
>>
>> So just something like the appended (TOTALLY UNTESTED) perhaps?
>>
>> Note that this should get it right even for overlapping partitions etc.
>>
>> Linus
>
> The problem can occur even if your patches are applied. Think about a case
> like the following.
>
> 1) There are 2 partitions, sda1 and sda2, on sda.
> 2) Open sda and issue an IO to sda2's first sector. sda2's in_flight
> is incremented even though sda, not sda2, was opened. This is because
> the partition lookup is based on which partition rq->__sector
> belongs to.
> 3) Issue an IO to sda1's last sector. It is merged into the IO issued
> in step (2) because the part of both is sda. In addition, rq->__sector
> is moved into sda1's region.
> 4) When the IO completes, sda1's in_flight is decremented, and diskstat
> is corrupted at that point.
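
For illustration, steps 1) to 4) above can be replayed in a small user-space model. This is only a sketch under stated assumptions: struct part, struct req, lookup(), start_io(), front_merge(), end_io() and replay_scenario() are hypothetical stand-ins, not the kernel's structures or functions, and the partition sizes are made up. The only behavior taken from the description is that the partition is found from the request's start sector both when the IO starts and when it completes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct part {
    unsigned long start;    /* first sector of the partition */
    unsigned long nr;       /* number of sectors */
    int in_flight;          /* IOs currently accounted to this partition */
};

struct req {
    unsigned long sector;   /* plays the role of rq->__sector */
    unsigned long nr;       /* length in sectors */
};

/* Sector-based lookup: which partition contains this sector? */
static struct part *lookup(struct part *tbl, size_t n, unsigned long sector)
{
    for (size_t i = 0; i < n; i++)
        if (sector >= tbl[i].start && sector < tbl[i].start + tbl[i].nr)
            return &tbl[i];
    return NULL;
}

/* Account the start of an IO to whatever partition the start sector is in. */
static void start_io(struct part *tbl, size_t n, const struct req *rq)
{
    struct part *p = lookup(tbl, n, rq->sector);
    assert(p != NULL);
    p->in_flight++;
}

/* Front merge: an adjacent earlier range is glued on, moving the start. */
static void front_merge(struct req *rq, unsigned long sector, unsigned long nr)
{
    assert(sector + nr == rq->sector);
    rq->sector = sector;
    rq->nr += nr;
}

/* Account completion, again by looking up the (possibly moved) start sector. */
static void end_io(struct part *tbl, size_t n, const struct req *rq)
{
    struct part *p = lookup(tbl, n, rq->sector);
    assert(p != NULL);
    p->in_flight--;
}

/* Replays steps 1-4; out[0] is sda1's counter, out[1] is sda2's. */
static void replay_scenario(int out[2])
{
    struct part tbl[2] = {
        { .start = 0,   .nr = 100, .in_flight = 0 },   /* "sda1" (assumed size) */
        { .start = 100, .nr = 100, .in_flight = 0 },   /* "sda2" (assumed size) */
    };
    struct req rq = { .sector = 100, .nr = 1 };        /* sda2's first sector */

    start_io(tbl, 2, &rq);      /* step 2: sda2's in_flight goes up */
    front_merge(&rq, 99, 1);    /* step 3: sda1's last sector is merged in */
    end_io(tbl, 2, &rq);        /* step 4: the decrement lands on sda1 */

    out[0] = tbl[0].in_flight;
    out[1] = tbl[1].in_flight;
}
```

In this model, replay_scenario() leaves the sda1 counter at -1 and the sda2 counter at 1, which is the imbalance described in step 4).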
>
> I think fixing this case is difficult and would cause more complexity.
>
> I hit on another approach. Unlike what Linus preferred, it doesn't prevent
> any merges, but it can fix the problem anyway. In this idea, in_flight is
> incremented and decremented for the partition that the request belonged
> to at its creation. It has the following merits.
I really would prefer if we fixed up the patchset we ended up reverting.
At least that had a purpose with growing struct request, since we saved
on doing the partition lookups.
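
As a rough illustration of why carrying the partition in the request helps, here is a minimal user-space sketch. Everything in it is an assumption made for illustration: struct partition, struct request, find_partition(), account_start(), account_done() and replay_with_cached_part() are hypothetical names, and this is not the code of the reverted patchset. It only shows that if the partition is recorded once, when the request is created, completion needs no second lookup and the counters stay balanced even after a merge moves the start sector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct partition {
    unsigned long start;        /* first sector of the partition */
    unsigned long nr;           /* number of sectors */
    int in_flight;              /* IOs currently accounted to this partition */
};

struct request {
    unsigned long sector;       /* start sector; may change on a front merge */
    unsigned long nr;           /* length in sectors */
    struct partition *part;     /* assumed extra field, set once at creation */
};

/* Sector-based lookup, done only once per request in this sketch. */
static struct partition *find_partition(struct partition *tbl, size_t n,
                                        unsigned long sector)
{
    for (size_t i = 0; i < n; i++)
        if (sector >= tbl[i].start && sector < tbl[i].start + tbl[i].nr)
            return &tbl[i];
    return NULL;
}

/* At creation: look the partition up, remember it, and account the IO. */
static void account_start(struct partition *tbl, size_t n, struct request *rq)
{
    rq->part = find_partition(tbl, n, rq->sector);
    assert(rq->part != NULL);
    rq->part->in_flight++;
}

/* At completion: use the remembered partition; no lookup, no mismatch. */
static void account_done(struct request *rq)
{
    rq->part->in_flight--;
}

/* Same sequence of events as in the quoted scenario, with a cached pointer. */
static void replay_with_cached_part(int out[2])
{
    struct partition tbl[2] = {
        { .start = 0,   .nr = 100, .in_flight = 0 },   /* "sda1" (assumed size) */
        { .start = 100, .nr = 100, .in_flight = 0 },   /* "sda2" (assumed size) */
    };
    struct request rq = { .sector = 100, .nr = 1, .part = NULL };

    account_start(tbl, 2, &rq); /* accounted to sda2 and remembered */
    rq.sector = 99;             /* front merge pulls the start into sda1 */
    rq.nr += 1;
    account_done(&rq);          /* still decrements sda2 */

    out[0] = tbl[0].in_flight;
    out[1] = tbl[1].in_flight;
}
```

Here both counters end at 0. The cost is one extra pointer per request, which is the trade-off mentioned above: the structure grows, but the lookup at completion goes away.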
--
Jens Axboe