Re: [PATCH V10 06/12] of: device: Fix overflow of coherent_dma_mask

From: Frank Rowand
Date: Thu Apr 06 2017 - 15:24:48 EST


On 04/06/17 03:24, Robin Murphy wrote:
> On 06/04/17 08:01, Frank Rowand wrote:
>> On 04/04/17 03:18, Sricharan R wrote:
>>> Size of the dma-range is calculated as coherent_dma_mask + 1
>>> and passed to arch_setup_dma_ops further. It overflows when
>>> the coherent_dma_mask is set for full 64 bits 0xFFFFFFFFFFFFFFFF,
>>> resulting in size getting passed as 0 wrongly. Fix this by
>>> passsing in max(mask, mask + 1). Note that in this case
>>> when the mask is set to full 64bits, we will be passing the mask
>>> itself to arch_setup_dma_ops instead of the size. The real fix
>>> for this should be to make arch_setup_dma_ops receive the
>>> mask and handle it, to be done in the future.
>>>
>>> Signed-off-by: Sricharan R <sricharan@xxxxxxxxxxxxxx>
>>> ---
>>> drivers/of/device.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/of/device.c b/drivers/of/device.c
>>> index c17c19d..c2ae6bb 100644
>>> --- a/drivers/of/device.c
>>> +++ b/drivers/of/device.c
>>> @@ -107,7 +107,7 @@ void of_dma_configure(struct device *dev, struct device_node *np)
>>> ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
>>> if (ret < 0) {
>>> dma_addr = offset = 0;
>>> - size = dev->coherent_dma_mask + 1;
>>> + size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1);
>>> } else {
>>> offset = PFN_DOWN(paddr - dma_addr);
>>> dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
>>>
>>
>> NACK.
>>
>> Passing an invalid size to arch_setup_dma_ops() is only part of the problem.
>> size is also used in of_dma_configure() before calling arch_setup_dma_ops():
>>
>> dev->coherent_dma_mask = min(dev->coherent_dma_mask,
>> DMA_BIT_MASK(ilog2(dma_addr + size)));
>> *dev->dma_mask = min((*dev->dma_mask),
>> DMA_BIT_MASK(ilog2(dma_addr + size)));
>>
>> which would be incorrect for size == 0xffffffffffffffffULL when
>> dma_addr != 0. So the proposed fix really is not papering over
>> the base problem very well.
>
> I'm not sure I agree there. Granted, there exist many more problematic
> aspects than are dealt with here (I've got more patches cooking to sort
> out some of the other issues we have with dma-ranges), but considering
> size specifically:
>
> - It is not possible to explicitly specify a range with a size of 2^64
> in DT. If someone does specify a size of 0, they've done a silly thing
> and should not be surprised that it ends badly.
>
> - It *is* perfectly legitimate for bus code (or a previous device
> driver, once we start coming here at probe time) to have set a device's
> DMA mask to 0xffffffffffffffffULL. If this code then blindly overflows
> and infers an invalid size of 0 from that, breaking things in the
> process, that is this code's fault alone. It just so happens that
> nothing managed to trigger the latent problem until patch #7 here shakes
> up the callsites.

The existing code that uses size does not appear capable of dealing with
the case of DMA mask of 0xffffffffffffffffULL since 2^64 does not fit
into size.

The code affected by the DMA mask is not within my area of knowledge, so
take the following with a grain of salt. If a DMA mask of
0xffffffffffffffffULL is provided, would the code still work without error
(though with reduced capability) if the mask was changed to
0xefffffffffffffffULL? I would guess that the location to do so would
be where dev->coherent_dma_mask is set, or some other location that
is not of_dma_configure(). This would just be a temporary workaround.


> Yes, wacky impossible base + size combinations in DT were a theoretical
> problem before, and remain a theoretical problem, but also fall into the
> "how did you ever expect this to work?" category. There's certainly
> plenty more we can do to improve the DT parsing/validation, but that
> still doesn't apply to this path where the information is *not* coming
> from the DT at all.
>
>> I agree that the proper solution involves passing a mask instead
>> of a size to arch_setup_dma_ops().
>
> Having started writing that patch too, I can tell you it's a big bugger
> touching multiple architectures and fixing up various drivers doing
> stupid things, hence why I'm happy with this point fix being the lesser
> of two evils in terms of not holding up this mostly-orthogonal series.
>
> Robin.
>
>>
>> -Frank
>>
>
>