Re: [PATCH 3/4] dt-binding: remoteproc: venus rproc dt binding document

From: Marek Szyprowski
Date: Thu Sep 15 2016 - 04:49:10 EST


Hi Bjorn,


On 2016-09-02 22:12, Bjorn Andersson wrote:
On Fri 02 Sep 04:52 PDT 2016, Marek Szyprowski wrote:

On 2016-09-01 16:58, Stanimir Varbanov wrote:
...
But I presume we have the implementation issue of dma_alloc_coherent()
failing in either case with the 5MB size. I think we need to look into
It'd be good to include Marek Szyprowski? At least he will know what
design restrictions there are.

Please do. The more I look at this the more I think we must use the
existing infrastructure for allocating "dma memory". Getting
dma_alloc_coherent() supporting non-power-of-2 memory regions would
Just to be precise it should be dma_alloc_from_coherent().

Marek, what is your opinion on that? Can we make dma_alloc_from_coherent()
able to allocate memory for sizes with bigger granularity?

For your convenience here [1] is the mail thread.
There should be no technical restrictions on adding support for bigger
granularity than power-of-2. dma_alloc_from_coherent() uses the standard
bitmap-based allocator, so it already supports tracking allocations of
arbitrary size.
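
To illustrate the point about arbitrary sizes, here is a minimal userspace
sketch of a one-bit-per-page bitmap that reserves a run of any length, not
just a power of 2. It is not the kernel code: all sketch_* names are
hypothetical, and the linear first-fit search is only an assumption made to
keep the example short.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static int sketch_test_bit(const unsigned long *map, size_t bit)
{
	return (map[bit / SKETCH_BITS_PER_LONG] >>
		(bit % SKETCH_BITS_PER_LONG)) & 1UL;
}

static void sketch_set_bit(unsigned long *map, size_t bit)
{
	map[bit / SKETCH_BITS_PER_LONG] |= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

/*
 * Find the first run of nr clear bits in [0, nbits), mark it busy and
 * return its start index, or -1 if no such run exists. nr may be any
 * positive count; nothing here requires a power of 2.
 */
static long sketch_alloc_pages(unsigned long *map, size_t nbits, size_t nr)
{
	size_t start, i;

	if (nr == 0 || nr > nbits)
		return -1;

	for (start = 0; start + nr <= nbits; start++) {
		for (i = 0; i < nr; i++)
			if (sketch_test_bit(map, start + i))
				break;
		if (i == nr) {
			for (i = 0; i < nr; i++)
				sketch_set_bit(map, start + i);
			return (long)start;
		}
		/* Skip past the busy bit; the loop adds one more. */
		start += i;
	}
	return -1;
}
```

With a 64-bit map, a request for 5 pages followed by a request for 3 pages
lands at bits 0 and 5 respectively, with no rounding of either size.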
I believe we should be able to change the parameter of
bitmap_{find_free,release,allocate}_region() to take a size rather than
an order.

The mask used in __reg_op() is an unsigned long, that is stamped over
the region to be masked or cleared, so there are some clear restrictions
in what parameters we can pass there - without having to break this
operation up in steps.

But if we derive the offset by taking the next power-of-two of the size and
then align the number of bits to min(count, BITS_PER_LONG) we should
retain the performance characteristics and requirements of __reg_op().

However, for small allocations (smaller than 64 KiB? 512 KiB?) it
would make sense to keep the nearest-power-of-2 round-up to prevent memory
fragmentation.
But in our case each bit matches a single page, so by making sure the
mask always fills the unsigned long in the larger cases we would end up
having to align things to 128 KiB (or 256 KiB if unsigned long is 64
bits).
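
The alignment figures above follow from one bit per page and one full
unsigned long per mask. A small sketch of that arithmetic, assuming 4 KiB
pages (the page size is not stated in the thread) and using hypothetical
sketch_* names:

```c
/* Assumption: 4 KiB pages; the thread does not state the page size. */
#define SKETCH_PAGE_SIZE 4096UL

/* Bytes covered by one full bitmap word when each bit is one page. */
static unsigned long sketch_word_span(unsigned int bits_per_long)
{
	return bits_per_long * SKETCH_PAGE_SIZE;
}

/* Round a size in bytes up to a whole number of bitmap words. */
static unsigned long sketch_round_to_word(unsigned long size,
					  unsigned int bits_per_long)
{
	unsigned long span = sketch_word_span(bits_per_long);

	return (size + span - 1) / span * span;
}
```

Under that assumption a 5 MiB request is already a whole number of words
(40 words of 128 KiB, or 20 words of 256 KiB), so it would need no padding,
whereas rounding up to the next power of 2 would reserve 8 MiB.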

By "preventing memory fragmentation" I meant aligning small allocations (less
than the mentioned 64 KiB or 512 KiB) to the nearest power of 2 of their size,
just as is done now.
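
A rough sketch of that policy, only to make the idea concrete. The 512 KiB
threshold is one of the two candidates floated above, not a decided value;
the 4 KiB page size and all sketch_* names are assumptions as well.

```c
/* Assumptions: 4 KiB pages, and 512 KiB as the "small" cut-off. */
#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_SMALL_LIMIT (512UL * 1024UL)

/* Smallest power of 2 that is >= n (returns 1 for n == 0). */
static unsigned long sketch_roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Number of pages to reserve for a request of size bytes: small requests
 * keep the power-of-2 round-up, larger ones reserve exactly the pages
 * they need.
 */
static unsigned long sketch_pages_to_reserve(unsigned long size)
{
	unsigned long pages = (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

	if (size < SKETCH_SMALL_LIMIT)
		return sketch_roundup_pow2(pages);
	return pages;
}
```

With these assumptions a 20 KiB request reserves 8 pages, as it would with
the current round-up, while a 5 MiB request reserves 1280 pages instead of
2048.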

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland