Re: [Xen-devel] [PATCH 2/2] block/xen-blkfront: Handle non-indirect grant with 64KB pages

From: Roger Pau MonnÃ
Date: Mon Oct 19 2015 - 07:16:32 EST


El 12/10/15 a les 20.00, Julien Grall ha escrit:
> On 06/10/15 11:06, Roger Pau Monnà wrote:
>> El 06/10/15 a les 11.58, Julien Grall ha escrit:
>>> Hi Roger,
>>>
>>> On 06/10/2015 10:39, Roger Pau Monnà wrote:
>>>> El 05/10/15 a les 19.05, Julien Grall ha escrit:
>>>>> On 05/10/15 17:01, Roger Pau Monnà wrote:
>>>>>> El 11/09/15 a les 21.32, Julien Grall ha escrit:
>>>>>>> ring_req->u.rw.nr_segments = num_grant;
>>>>>>> + if (unlikely(require_extra_req)) {
>>>>>>> + id2 = blkif_ring_get_request(info, req, &ring_req2);
>>>>>>
>>>>>> How can you guarantee that there's always going to be another free
>>>>>> request? AFAICT blkif_queue_rq checks for RING_FULL, but you don't
>>>>>> actually know if there's only one slot or more than one available.
>>>>>
>>>>> Because the depth of the queue is divided by 2 when the extra request is
>>>>> used (see xlvbd_init_blk_queue).
>>>
>>> I just noticed that I didn't mention this restriction in the commit
>>> message. I will do it in the next revision.
>>>
>>>> I see, that's quite restrictive but I guess it's better than introducing
>>>> a new ring macro in order to figure out if there are at least two free
>>>> slots.
>>>
>>> I actually didn't think about your suggestion. I choose to divide by two
>>> based on the assumption that the block framework will always try to send
>>> a request with the maximum data possible.
>>
>> AFAIK that depends on the request itself, the block layer will try to
>> merge requests if possible, but you can also expect that there are going
>> to be requests that will just contain a single block.
>>
>>> I don't know if this assumption is correct as I'm not fully aware how
>>> the block framework is working.
>>>
>>> If it's valid, in the case of 64KB guest, the maximum size of a request
>>> would be 64KB when indirect segment is not supported. So we would end up
>>> with a lot of 64KB request which will require 2 ring request.
>>
>> I guess you could add a counter in order to see how many requests were
>> split vs total number of requests.
>
> So the number of 64KB request is fairly small compare to the total
> number of request (277 for 4687 request) for general usage (i.e cd, find).
>
> Although as soon as I use dd, the block request will be merged. So I
> guess a common usage will not provide enough data to fill a 64KB request.
>
> Although as soon as I use dd with a block size of 64KB, most of the
> request fill 64KB and an extra request is required.
>
> Note that I had to implement quickly xen_biovec_phys_mergeable for 64KB
> page as I left this aside. Without it, the biovec won't be merge except
> with dd if you specific the block size (bs=64KB).
>
> I've also looked to the code to see if it's possible to check if there
> is 2 ring requests free and if not waiting until they are available.
>
> Currently, we don't need to check if a request if free because the queue
> is sized according to the number of request supported by the ring. This
> means that the block layer is handling the check and we will always have
> space in the ring.
>
> If we decide to avoid dividing the number of request enqueue by the
> block layer, we would have to handle ourself if there is 2 ring requests
> free.
> AFAICT, when BLK_MQ_RQ_BUSY is returned the block layer will stop the
> queue. So we need to have some logic in blkfront to know when the 2 ring
> requests become free and restart the queue. I guest it would be similar
> to gnttab_request_free_callback.
>
> I'd like your advice to know whether this is worth to implement it in
> blockfront given that it will be only used for 64KB guest with backend
> not supporting indirect grant.

At this point I don't think it's worth implementing it, if you feel like
doing that later in order to improve performance that would be fine, but
I don't think it should be required in order to get this merged.

I think you had to resend the patch anyway to fix some comments, but
apart from that:

Acked-by: Roger Pau Monnà <roger.pau@xxxxxxxxxx>

Roger.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/