Re: [PATCH v9 00/13] firmware: qcom: qseecom: convert to using the TZ allocator

From: Maximilian Luz
Date: Fri Mar 29 2024 - 14:56:37 EST




On 3/29/24 7:53 PM, Maximilian Luz wrote:
On 3/29/24 11:22 AM, Bartosz Golaszewski wrote:
On Thu, Mar 28, 2024 at 7:55 PM Bartosz Golaszewski <brgl@xxxxxxxx> wrote:

On Thu, Mar 28, 2024 at 5:50 PM Maximilian Luz <luzmaximilian@xxxxxxxxx> wrote:

If I understand correctly, it enters an atomic section in
qcom_tzmem_alloc() and then tries to schedule somewhere down the line.
So this shouldn't be qseecom specific.

I should probably also say that I'm currently testing this on a patched
v6.8 kernel, so there's a chance that it's my fault. However, as far as
I understand, it enters an atomic section in qcom_tzmem_alloc() and then
later tries to expand the pool memory with dma_alloc_coherent(). Which
AFAIK is allowed to sleep with GFP_KERNEL (and I guess that that's the
issue here).

I've also tried the shmem allocator option, but that seems to get stuck
quite early at boot, before I even have usb-serial access to get any
logs. If I can find some more time, I'll try to see if I can get some
useful output for that.


Ah, I think it happens here:

+       guard(spinlock_irqsave)(&pool->lock);
+
+again:
+       vaddr = gen_pool_alloc(pool->genpool, size);
+       if (!vaddr) {
+               if (qcom_tzmem_try_grow_pool(pool, size, gfp))
+                       goto again;

We were called with GFP_KERNEL so this is what we pass on to
qcom_tzmem_try_grow_pool() but we're now holding the spinlock. I need
to revisit it. Thanks for the catch!

Bart

Can you try the following tree?

     https://git.codelinaro.org/bartosz_golaszewski/linux.git
topic/shm-bridge-v10

gen_pool_alloc() and gen_pool_add_virt() can be used without external
serialization. We only really need to protect the list of areas in the
pool when adding a new element. We could possibly even use
list_add_tail_rcu() as it updates the pointers atomically and go
lockless.

Thanks! That fixes the allocations for CONFIG_QCOM_TZMEM_MODE_GENERIC=y.
Unfortunately, with the shmbridge mode it still gets stuck at boot (and
I haven't had the time to look into it yet).

And for more bad news: It looks like the new allocator now fully exposes
a bug that I've been tracking down the last couple of days. In short,
uefisecapp doesn't seem to be happy when we split the allocations for
request and response into two, causing commands to fail. Instead it
wants a single buffer for both. Before, it seemed to be fairly sporadic
(likely because kzalloc in sequence just returned consecutive memory
almost all of the time) but now it's basically every call that fails.

I have a fix for that almost ready and I'll likely post it in the next
hour. But that means that you'll probably have to rebase this series
on top of it...

Forgot to mention: I tested it with the fix and this series, and that
works.

Best regards,
Max