Re: [PATCH] xen/swiotlb: correct the check for xen_destroy_contiguous_region

From: Joe Jin
Date: Tue Apr 28 2020 - 14:50:55 EST


On 4/28/20 10:25 AM, Konrad Rzeszutek Wilk wrote:
> On Tue, Apr 28, 2020 at 12:19:41PM +0200, JÃrgen Groà wrote:
>> On 28.04.20 10:25, Peng Fan wrote:
>
> Adding Joe Jin.
>
> Joe, didn't you have some ideas on how this could be implemented?
>
>>>> Subject: Re: [PATCH] xen/swiotlb: correct the check for
>>>> xen_destroy_contiguous_region
>>>>
>>>> On 28.04.20 09:33, peng.fan@xxxxxxx wrote:
>>>>> From: Peng Fan <peng.fan@xxxxxxx>
>>>>>
>>>>> When booting xen on i.MX8QM, met:
>>>>> "
>>>>> [ 3.602128] Unable to handle kernel paging request at virtual address
>>>> 0000000000272d40
>>>>> [ 3.610804] Mem abort info:
>>>>> [ 3.613905] ESR = 0x96000004
>>>>> [ 3.617332] EC = 0x25: DABT (current EL), IL = 32 bits
>>>>> [ 3.623211] SET = 0, FnV = 0
>>>>> [ 3.626628] EA = 0, S1PTW = 0
>>>>> [ 3.630128] Data abort info:
>>>>> [ 3.633362] ISV = 0, ISS = 0x00000004
>>>>> [ 3.637630] CM = 0, WnR = 0
>>>>> [ 3.640955] [0000000000272d40] user address but active_mm is
>>>> swapper
>>>>> [ 3.647983] Internal error: Oops: 96000004 [#1] PREEMPT SMP
>>>>> [ 3.654137] Modules linked in:
>>>>> [ 3.677285] Hardware name: Freescale i.MX8QM MEK (DT)
>>>>> [ 3.677302] Workqueue: events deferred_probe_work_func
>>>>> [ 3.684253] imx6q-pcie 5f000000.pcie: PCI host bridge to bus 0000:00
>>>>> [ 3.688297] pstate: 60000005 (nZCv daif -PAN -UAO)
>>>>> [ 3.688310] pc : xen_swiotlb_free_coherent+0x180/0x1c0
>>>>> [ 3.693993] pci_bus 0000:00: root bus resource [bus 00-ff]
>>>>> [ 3.701002] lr : xen_swiotlb_free_coherent+0x44/0x1c0
>>>>> "
>>>>>
>>>>> In xen_swiotlb_alloc_coherent, if !(dev_addr + size - 1 <= dma_mask)
>>>>> or range_straddles_page_boundary(phys, size) are true, it will create
>>>>> contiguous region. So when free, we need to free contiguous region use
>>>>> upper check condition.
>>>>
>>>> No, this will break PV guests on x86.
>>>
>>> Could you share more details why alloc and free not matching for the check?
>>
>> xen_create_contiguous_region() is needed only in case:
>>
>> - the bus address is not within dma_mask, or
>> - the memory region is not physically contiguous (can happen only for
>> PV guests)
>>
>> In any case it should arrange for the memory to be suitable for the
>> DMA operation, so to be contiguous and within dma_mask afterwards. So
>> xen_destroy_contiguous_region() should only ever called for areas
>> which match above criteria, as otherwise we can be sure
>> xen_create_contiguous_region() was not used for making the area DMA-able
>> in the beginning.

I agreed with Juergen's explanation, That is my understanding.

Peng, if panic caused by (dev_addr + size - 1 > dma_mask), you should check
how you get the addr, if memory created by xen_create_contiguous_region(),
memory must be with in [0 - dma_mask].

Thanks,
Joe

>>
>> And this is very important in the PV case, as in those guests the page
>> tables are containing the host-PFNs, not the guest-PFNS, and
>> xen_create_contiguous_region() will fiddle with host- vs. guest-PFN
>> arrangements, and xen_destroy_contiguous_region() is reverting this
>> fiddling. Any call of xen_destroy_contiguous_region() for an area it
>> was not intended to be called for might swap physical pages beneath
>> random virtual addresses, which was the reason for this test to be
>> added by me.
>>
>>
>> Juergen
>>
>>>
>>> Thanks,
>>> Peng.
>>>
>>>>
>>>> I think there is something wrong with your setup in combination with the ARM
>>>> xen_create_contiguous_region() implementation.
>>>>
>>>> Stefano?
>>>>
>>>>
>>>> Juergen
>>>>
>>>>>
>>>>> Signed-off-by: Peng Fan <peng.fan@xxxxxxx>
>>>>> ---
>>>>> drivers/xen/swiotlb-xen.c | 4 ++--
>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
>>>>> index b6d27762c6f8..ab96e468584f 100644
>>>>> --- a/drivers/xen/swiotlb-xen.c
>>>>> +++ b/drivers/xen/swiotlb-xen.c
>>>>> @@ -346,8 +346,8 @@ xen_swiotlb_free_coherent(struct device *hwdev,
>>>> size_t size, void *vaddr,
>>>>> /* Convert the size to actually allocated. */
>>>>> size = 1UL << (order + XEN_PAGE_SHIFT);
>>>>>
>>>>> - if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
>>>>> - range_straddles_page_boundary(phys, size)) &&
>>>>> + if (((dev_addr + size - 1 > dma_mask) ||
>>>>> + range_straddles_page_boundary(phys, size)) &&
>>>>> TestClearPageXenRemapped(virt_to_page(vaddr)))
>>>>> xen_destroy_contiguous_region(phys, order);
>>>>>
>>>>>
>>>
>>