Re: [PATCH 01/16] remoteproc: Extend rproc_da_to_va() API with a flags parameter

From: Suman Anna
Date: Wed Feb 13 2019 - 22:36:45 EST

Hi Roger,

On 12/4/18 4:03 AM, Roger Quadros wrote:
> On 29/11/18 18:12, David Lechner wrote:
>> On 11/29/18 4:29 AM, Roger Quadros wrote:
>>> Bjorn, Suman,
>>> On 26/11/18 23:29, David Lechner wrote:
>>>> On 11/26/18 1:52 AM, Roger Quadros wrote:
>>>>> From: Suman Anna <s-anna@xxxxxx>
>>>>> The rproc_da_to_va() API is currently used to perform any device
>>>>> to kernel address translations to meet the different needs of the
>>>>> remoteproc core/platform drivers (e.g. loading). The function also
>>>>> invokes the da_to_va ops, if present, to allow the remoteproc
>>>>> platform drivers to provide address translation. However, not all
>>>>> platform implementations have linear address spaces, and may need
>>>>> an additional parameter to be able to perform proper translations.
>>>>> The rproc_da_to_va() API and the rproc .da_to_va ops have therefore
>>>>> been expanded to take in an additional flags field enabling some
>>>>> remoteproc implementations (like the TI PRUSS remoteproc driver)
>>>>> to use these flags. Also, define some semantics for this flags
>>>>> argument as this can vary from one implementation to another. A
>>>>> new flags type is encoded into the upper 16 bits alongside the
>>>>> actual value in the lower 16 bits for the flags argument, to
>>>>> allow different individual implementations to have better
>>>>> flexibility in interpreting the flags as per their needs.
>>>> This seems like an overly complex solution for a rather simple
>>>> problem. Instead of passing all sorts of flags, could we just add
>>>> a parameter named "page" to da_to_va() that indicates the memory
>>>> page of the address in the remote processor?
>>>> Or perhaps there is some other use for all of these flags that I
>>>> am not aware of?
>>> I'm not a big fan of this patch either.
>>> rproc_da_to_va() is used at the following places
>>> 2 qcom_q6v5_mss.c qcom_q6v5_dump_segment 974 void *ptr = rproc_da_to_va(rproc, segment->da, segment->size,
>>> 3 remoteproc_core.c rproc_da_to_va 197 void *rproc_da_to_va(struct rproc *rproc, u64 da, int len, u32 flags)
>>> 4 remoteproc_core.c rproc_handle_trace 582 ptr = rproc_da_to_va(rproc, rsc->da, rsc->len, RPROC_FLAGS_NONE);
>>> 5 remoteproc_core.c rproc_coredump 1592 ptr = rproc_da_to_va(rproc, segment->da, segment->size,
>>> 6 remoteproc_elf_loader.c rproc_elf_load_segments 185 ptr = rproc_da_to_va(rproc, da, memsz,
>>> 7 remoteproc_elf_loader.c rproc_elf_find_loaded_rsc_table 337 return rproc_da_to_va(rproc, shdr->sh_addr, shdr->sh_size,
>>> At rproc_elf_load_segments() we need to pass enough information so that
>>> the rproc driver can load the segment into proper area (IRAM vs DRAM).
>>> So providing page should suffice.
>> FYI, the PRU series I sent a while back has some patches to do
>> something like this so feel free to use them if they are helpful.
> Thanks. I think we need to do something like that. Too bad you had to reverse engineer
> the TI specific headers. I'll check if we have this available somewhere internally.

I commented on this in your v2 series, but let's see if we can
incorporate some of this custom logic within the PRU remoteproc driver
instead. The remoteproc core does allow you to use your own
implementations for some firmware-related ops.
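
To make the idea concrete, here is a rough userspace sketch of the
override pattern, not actual remoteproc code. Everything prefixed with
demo_ is hypothetical, and the ops table is reduced to a single hook;
the real struct rproc_ops has more callbacks. The point is only that the
core falls back to a generic loader when the platform driver does not
supply its own.

```c
#include <assert.h>
#include <stddef.h>

struct demo_rproc;

/* Hypothetical, reduced ops table; stands in for the driver's firmware ops. */
struct demo_rproc_ops {
	int (*load)(struct demo_rproc *rproc);
};

struct demo_rproc {
	struct demo_rproc_ops ops;
	int loaded_by;	/* records which loader ran, for illustration only */
};

enum { DEMO_LOADER_GENERIC = 1, DEMO_LOADER_PRU = 2 };

/* Generic loader used when the driver does not provide one. */
static int demo_generic_load(struct demo_rproc *rproc)
{
	rproc->loaded_by = DEMO_LOADER_GENERIC;
	return 0;
}

/* Driver-specific loader, e.g. one that knows about IRAM vs DRAM. */
static int demo_pru_load(struct demo_rproc *rproc)
{
	rproc->loaded_by = DEMO_LOADER_PRU;
	return 0;
}

/* Core entry point: install the generic loader if none is set, then call it. */
static int demo_rproc_load(struct demo_rproc *rproc)
{
	assert(rproc != NULL);
	if (!rproc->ops.load)
		rproc->ops.load = demo_generic_load;
	return rproc->ops.load(rproc);
}
```

With this shape, the address-space specifics stay inside the driver's
own loader and the core API does not need an extra parameter for them.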

>>> I want to understand more about rproc_elf_find_loaded_rsc_table() myself.
>>> rproc_elf_find_loaded_rsc_table() is called only in rproc_start() in remoteproc_core.c
>>> with the comment
>>> /*
>>> * The starting device has been given the rproc->cached_table as the
>>> * resource table. The address of the vring along with the other
>>> * allocated resources (carveouts etc) is stored in cached_table.
>>> * In order to pass this information to the remote device we must copy
>>> * this information to device memory. We also update the table_ptr so
>>> * that any subsequent changes will be applied to the loaded version.
>>> */
>>> loaded_table = rproc_find_loaded_rsc_table(rproc, fw);
>>> Why isn't cached_table sufficient?
>>> Why do we need to call rproc_find_loaded_rsc_table()?
>>> Why do we need to load the resource table into remote processor memory at all?
>>> As discussed earlier, some PRU systems have very little memory (512 bytes?)
>>> and we want to avoid unnecessary loading.
> This question still holds.
> Suman?

The resource table is used for publishing back various allocated
resource values to the remote processor, so this needs to be loaded into
memory. For example, with RSC_CARVEOUTs, we fill in the address Linux
kernel allocated. With VDEVs, the virtio config status is shared and
used for synchronization between the Linux host and the remote processor.
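
As a minimal illustration of the publish-back step for a carveout, here
is a userspace sketch under stated assumptions. The struct is a
simplified stand-in for a carveout entry (the real resource table entry
carries more fields), and the demo_ names are hypothetical. The host
fills in the address it allocated in its cached copy, copies the entry
into the memory the remote processor reads, and the remote side picks
the address up from there.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a carveout resource entry (hypothetical layout). */
struct demo_carveout {
	uint32_t da;	/* device address as seen by the remote processor */
	uint32_t pa;	/* physical address, filled in by the host */
	uint32_t len;	/* requested size in bytes */
};

/*
 * Host side: record the allocated address in the cached copy, then copy
 * the entry into the memory that the remote processor reads.
 */
static void demo_publish_carveout(struct demo_carveout *cached,
				  void *loaded, uint32_t allocated_pa)
{
	assert(cached != NULL && loaded != NULL);
	cached->pa = allocated_pa;
	memcpy(loaded, cached, sizeof(*cached));
}

/* Remote side: read the published address back from its own memory. */
static uint32_t demo_remote_read_pa(const void *loaded)
{
	struct demo_carveout entry;

	memcpy(&entry, loaded, sizeof(entry));
	return entry.pa;
}
```

Without the copy into device-visible memory, the remote side would only
ever see the values that were in the firmware image, not the ones the
host filled in.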