Re: [PATCH] xen-swiotlb: exchange memory with Xen only when pages are contiguous
From: Boris Ostrovsky
Date: Thu Oct 25 2018 - 12:10:25 EST
On 10/25/18 10:23 AM, Joe Jin wrote:
> On 10/25/18 4:45 AM, Boris Ostrovsky wrote:
>> On 10/24/18 10:43 AM, Joe Jin wrote:
>>> On 10/24/18 6:57 AM, Boris Ostrovsky wrote:
>>>> On 10/24/18 9:02 AM, Konrad Rzeszutek Wilk wrote:
>>>>> On Tue, Oct 23, 2018 at 08:09:04PM -0700, Joe Jin wrote:
>>>>>> Commit 4855c92dbb7 "xen-swiotlb: fix the check condition for
>>>>>> xen_swiotlb_free_coherent" only fixed memory address check condition
>>>>>> on xen_swiotlb_free_coherent(), when memory was not physically
>>>>>> contiguous and an exchange with Xen was attempted via
>>>>>> xen_destroy_contiguous_region it will lead kernel panic.
>>>>> s/it will lead/which lead to/?
>>>>>
>>>>>> The correct check condition should be memory is in DMA area and
>>>>>> physically contiguous.
>>>>> "The correct check condition to make Xen hypercall to revert the
>>>>> memory back from its 32-bit pool is if it is:
>>>>> 1) Above its DMA bit mask (for example 32-bit devices can only address
>>>>> up to 4GB, and we may want 4GB+2K), and
>>>> Is this 'and' or 'or'?
>>>>
>>>>> 2) If it is not physically contiguous
>>>>>
>>>>> N.B. The logic in the code is inverted, which leads to all sorts of
>>>>> confusions."
>>>> I would, in fact, suggest to make the logic the same in both
>>>> xen_swiotlb_alloc_coherent() and xen_swiotlb_free_coherent() to avoid
>>>> this. This will involve swapping if and else in the former.
>>>>
>>>>
>>>>> Does that sound correct?
>>>>>
>>>>>> Thank you Boris for pointing it out.
>>>>>>
>>>>> Fixes: 4855c92dbb7 ("xen-sw..") ?
>>>>>
>>>>>> Signed-off-by: Joe Jin <joe.jin@xxxxxxxxxx>
>>>>>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
>>>>>> Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
>>>>> Reported-by: Boris Ostrovs... ?
>>>>>> Cc: Christoph Hellwig <hch@xxxxxx>
>>>>>> Cc: Dongli Zhang <dongli.zhang@xxxxxxxxxx>
>>>>>> Cc: John Sobecki <john.sobecki@xxxxxxxxxx>
>>>>>> ---
>>>>>> drivers/xen/swiotlb-xen.c | 4 ++--
>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>>>>>> index f5c1af4ce9ab..aed92fa019f9 100644
>>>>>> --- a/drivers/xen/swiotlb-xen.c
>>>>>> +++ b/drivers/xen/swiotlb-xen.c
>>>>>> @@ -357,8 +357,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>>>>>> /* Convert the size to actually allocated. */
>>>>>> size = 1UL << (order + XEN_PAGE_SHIFT);
>>>>>>
>>>>>> - if (((dev_addr + size - 1 <= dma_mask)) ||
>>>>>> - range_straddles_page_boundary(phys, size))
>>>>>> + if ((dev_addr + size - 1 <= dma_mask) &&
>>>>>> + !range_straddles_page_boundary(phys, size))
>>>>>> xen_destroy_contiguous_region(phys, order);
>>>> I don't think this is right.
>>>>
>>>> if ((dev_addr + size - 1 > dma_mask) || range_straddles_page_boundary(phys, size))
>>>>
>>>> No?
>>> No, this is not correct.
>>>
>>> When allocating memory, the code first tries to allocate from Dom0/guest memory, then
>>> checks whether the physical address is DMA-able and contiguous. If it is not, the memory
>>> is exchanged with the hypervisor. The code is below:
>>>
>>>     phys = *dma_handle;
>>>     dev_addr = xen_phys_to_bus(phys);
>>>     if (((dev_addr + size - 1 <= dma_mask)) &&
>>>         !range_straddles_page_boundary(phys, size))
>>>             *dma_handle = dev_addr;
>>>     else {
>>>             if (xen_create_contiguous_region(phys, order,
>>>                     fls64(dma_mask), dma_handle) != 0) {
>>>                     xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
>>>                     return NULL;
>>>             }
>>>     }
>>>
>>>
>>> On freeing, the memory needs to be returned to Xen, otherwise DMA memory will be used
>>> up (this is the issue the patch intends to fix). So when the memory is DMA-able and
>>> contiguous, call xen_destroy_contiguous_region() to return the DMA memory to Xen.
>> So if you want to allocate 1 byte at address 0 (and dev_addr=phys),
>> xen_create_contiguous_region() will not be called. And yet you will call
>> xen_destroy_contiguous_region() in the free path.
>>
>> Is this the expected behavior?
> I could not say it's expected behavior, but I think it's reasonable.
I would expect xen_create_contiguous_region() and
xen_destroy_contiguous_region() to come in pairs. If a region is
created, it needs to be destroyed. And vice versa.
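To make the pairing concern concrete, here is a minimal stand-alone sketch
(not kernel code) of the two conditions under discussion. The function
names and the dma_addr_model_t type are hypothetical, and the result of
range_straddles_page_boundary() is passed in as a plain boolean, so this
only models the boolean logic, not the real address translation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's dma_addr_t. */
typedef uint64_t dma_addr_model_t;

/*
 * True when the allocation path quoted above would take the else branch
 * and call xen_create_contiguous_region().
 */
static bool alloc_calls_create(dma_addr_model_t dev_addr, size_t size,
                               uint64_t dma_mask, bool straddles)
{
    return !((dev_addr + size - 1 <= dma_mask) && !straddles);
}

/*
 * True when the free path, with the condition from the proposed patch,
 * would call xen_destroy_contiguous_region().
 */
static bool patched_free_calls_destroy(dma_addr_model_t dev_addr, size_t size,
                                       uint64_t dma_mask, bool straddles)
{
    return (dev_addr + size - 1 <= dma_mask) && !straddles;
}
```

For the 1-byte allocation at address 0 with a 32-bit mask, the first
function returns false and the second returns true under these
assumptions, i.e. a destroy without a matching create.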
>
> On allocation, __get_free_pages() is used to allocate memory. If we are lucky and the
> memory is DMA-able, it is not exchanged with the hypervisor, but obviously this is not guaranteed.
>
> And on freeing, it cannot be determined whether the memory came from Dom0/guest's own
> memory or from the hypervisor
I think it can be. If !(dev_addr + size - 1 <= dma_mask) ||
range_straddles_page_boundary() is true, then the memory must have come
from the hypervisor, because that is the condition (the else branch)
under which xen_swiotlb_alloc_coherent() exchanges memory.
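
One way to keep the two paths from drifting apart would be a single
predicate used by both. The following is only a rough sketch: the helper
name needs_xen_exchange() and the two wrappers are hypothetical and do not
exist in drivers/xen/swiotlb-xen.c, the straddle check is modelled as a
boolean argument, and it assumes both paths see the same dev_addr, size
and dma_mask, which the real code may not guarantee.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the buffer is either above the device's
 * DMA mask or not physically contiguous, i.e. when it cannot be used
 * directly and has to be exchanged with Xen.
 */
static bool needs_xen_exchange(uint64_t dev_addr, size_t size,
                               uint64_t dma_mask, bool straddles)
{
    return !(dev_addr + size - 1 <= dma_mask) || straddles;
}

/* Alloc side: would xen_create_contiguous_region() be called? */
static bool alloc_would_create(uint64_t dev_addr, size_t size,
                               uint64_t dma_mask, bool straddles)
{
    return needs_xen_exchange(dev_addr, size, dma_mask, straddles);
}

/* Free side: would xen_destroy_contiguous_region() be called? */
static bool free_would_destroy(uint64_t dev_addr, size_t size,
                               uint64_t dma_mask, bool straddles)
{
    return needs_xen_exchange(dev_addr, size, dma_mask, straddles);
}
```

With one shared predicate, a create on the alloc side is matched by a
destroy on the free side for the same inputs, and neither happens for a
buffer that was usable as allocated.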
-boris
> , and if the memory is not returned to the hypervisor, the hypervisor's DMA memory
> will be used up. DMA requests in Dom0/guest may then fail and, worse, no new guest
> can be started.
>
> Thanks,
> Joe
>
>> -boris
>>