Re: [PATCH V10 06/12] of: device: Fix overflow of coherent_dma_mask

From: Robin Murphy
Date: Thu Apr 06 2017 - 06:25:21 EST


On 06/04/17 08:01, Frank Rowand wrote:
> On 04/04/17 03:18, Sricharan R wrote:
>> Size of the dma-range is calculated as coherent_dma_mask + 1
>> and passed to arch_setup_dma_ops further. It overflows when
>> the coherent_dma_mask is set for full 64 bits 0xFFFFFFFFFFFFFFFF,
>> resulting in size getting passed as 0 wrongly. Fix this by
>> passsing in max(mask, mask + 1). Note that in this case
>> when the mask is set to full 64bits, we will be passing the mask
>> itself to arch_setup_dma_ops instead of the size. The real fix
>> for this should be to make arch_setup_dma_ops receive the
>> mask and handle it, to be done in the future.
>>
>> Signed-off-by: Sricharan R <sricharan@xxxxxxxxxxxxxx>
>> ---
>> drivers/of/device.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/of/device.c b/drivers/of/device.c
>> index c17c19d..c2ae6bb 100644
>> --- a/drivers/of/device.c
>> +++ b/drivers/of/device.c
>> @@ -107,7 +107,7 @@ void of_dma_configure(struct device *dev, struct device_node *np)
>> ret = of_dma_get_range(np, &dma_addr, &paddr, &size);
>> if (ret < 0) {
>> dma_addr = offset = 0;
>> - size = dev->coherent_dma_mask + 1;
>> + size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1);
>> } else {
>> offset = PFN_DOWN(paddr - dma_addr);
>> dev_dbg(dev, "dma_pfn_offset(%#08lx)\n", offset);
>>
>
> NACK.
>
> Passing an invalid size to arch_setup_dma_ops() is only part of the problem.
> size is also used in of_dma_configure() before calling arch_setup_dma_ops():
>
> dev->coherent_dma_mask = min(dev->coherent_dma_mask,
> DMA_BIT_MASK(ilog2(dma_addr + size)));
> *dev->dma_mask = min((*dev->dma_mask),
> DMA_BIT_MASK(ilog2(dma_addr + size)));
>
> which would be incorrect for size == 0xffffffffffffffffULL when
> dma_addr != 0. So the proposed fix really is not papering over
> the base problem very well.

I'm not sure I agree there. Granted, there exist many more problematic
aspects than are dealt with here (I've got more patches cooking to sort
out some of the other issues we have with dma-ranges), but considering
size specifically:

- It is not possible to explicitly specify a range with a size of 2^64
in DT. If someone does specify a size of 0, they've done a silly thing
and should not be surprised that it ends badly.

- It *is* perfectly legitimate for bus code (or a previous device
driver, once we start coming here at probe time) to have set a device's
DMA mask to 0xffffffffffffffffULL. If this code then blindly overflows
and infers an invalid size of 0 from that, breaking things in the
process, that is this code's fault alone. It just so happens that
nothing managed to trigger the latent problem until patch #7 here shakes
up the callsites.

Yes, wacky impossible base + size combinations in DT were a theoretical
problem before, and remain a theoretical problem, but also fall into the
"how did you ever expect this to work?" category. There's certainly
plenty more we can do to improve the DT parsing/validation, but that
still doesn't apply to this path where the information is *not* coming
from the DT at all.

> I agree that the proper solution involves passing a mask instead
> of a size to arch_setup_dma_ops().

Having started writing that patch too, I can tell you it's a big bugger
touching multiple architectures and fixing up various drivers doing
stupid things, hence why I'm happy with this point fix being the lesser
of two evils in terms of not holding up this mostly-orthogonal series.

Robin.

>
> -Frank
>