Re: [PATCH 3/5] ARM: trusted_foundations: do not use naked function

From: Robin Murphy
Date: Thu Mar 22 2018 - 07:48:46 EST


On 21/03/18 21:41, Stefan Agner wrote:
On 21.03.2018 18:16, Robin Murphy wrote:
On 21/03/18 16:40, Stephen Warren wrote:
On 03/21/2018 09:26 AM, Dmitry Osipenko wrote:
On 21.03.2018 17:09, Stefan Agner wrote:
On 21.03.2018 13:13, Robin Murphy wrote:
On 20/03/18 23:02, Stefan Agner wrote:
As documented in GCC naked functions should only use Basic asm
syntax. The Extended asm or mixture of Basic asm and "C" code is
not guaranteed. Currently this works because it was hard coded
to follow and check GCC behavior for arguments and register
placement.

Furthermore with clang using parameters in Extended asm in a
naked function is not supported:
    arch/arm/firmware/trusted_foundations.c:47:10: error: parameter
            references not allowed in naked functions
                  : "r" (type), "r" (arg1), "r" (arg2)
                         ^

Use a regular function to be more portable. This aligns also with
the other smc call implementations e.g. in qcom_scm-32.c and
bcm_kona_smc.c.

Additionally, make sure all callee-saved registers get saved, as
was done before.

Signed-off-by: Stefan Agner <stefan@xxxxxxxx>
---
 arch/arm/firmware/trusted_foundations.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm/firmware/trusted_foundations.c b/arch/arm/firmware/trusted_foundations.c
index 3fb1b5a1dce9..426d732e6591 100644
--- a/arch/arm/firmware/trusted_foundations.c
+++ b/arch/arm/firmware/trusted_foundations.c
@@ -31,21 +31,23 @@
 static unsigned long cpu_boot_addr;
 
-static void __naked tf_generic_smc(u32 type, u32 arg1, u32 arg2)
+static void tf_generic_smc(u32 type, u32 arg1, u32 arg2)
 {
+	register u32 r0 asm("r0") = type;
+	register u32 r1 asm("r1") = arg1;
+	register u32 r2 asm("r2") = arg2;
+
 	asm volatile(
 		".arch_extension	sec\n\t"
-		"stmfd	sp!, {r4 - r11, lr}\n\t"
 		__asmeq("%0", "r0")
 		__asmeq("%1", "r1")
 		__asmeq("%2", "r2")
 		"mov	r3, #0\n\t"
 		"mov	r4, #0\n\t"
 		"smc	#0\n\t"
-		"ldmfd	sp!, {r4 - r11, pc}"
 		:
-		: "r" (type), "r" (arg1), "r" (arg2)
-		: "memory");
+		: "r" (r0), "r" (r1), "r" (r2)
+		: "memory", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10");

I may be missing a subtlety, but it looks like we no longer have a
guarantee that r11 will be caller-saved as it was previously. I don't
know the Trusted Foundations ABI to say whether that matters or not,
but if it is the case that it never needed preserving anyway, that
might be worth calling out in the commit message.

Adding r11 (fp) to the clobber list causes an error when using gcc and
CONFIG_FRAME_POINTER=y:
arch/arm/firmware/trusted_foundations.c: In function 'tf_generic_smc':
arch/arm/firmware/trusted_foundations.c:51:1: error: fp cannot be used
in asm here

Not sure what ABI Trusted Foundations follow.

[adding Stephen, Thierry and Dmitry]
Maybe someone more familiar with NVIDIA Tegra SoCs can help?

When CONFIG_FRAME_POINTER=y fp gets saved anyway. So we could add r11 to
clobber list ifndef CONFIG_FRAME_POINTER...
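
The conditional clobber suggested above could be sketched roughly as follows. This is only an illustration: TF_FP_CLOBBER, TF_FP_IN_CLOBBERS and tf_generic_smc_fp_sketch() are made-up names, the __asmeq() checks are left out to keep it self-contained, and ARM (not Thumb-2) code generation is assumed. Whether omitting r11 is actually safe with CONFIG_FRAME_POINTER=y depends on whether Trusted Foundations modifies r11, which is the open question in this thread.

```c
/*
 * Sketch only: the TF_* macros and tf_generic_smc_fp_sketch() are
 * hypothetical names, not part of the patch under review.
 */
#ifdef CONFIG_FRAME_POINTER
/* GCC rejects "r11" as a clobber when fp is in use, so leave it out. */
#define TF_FP_CLOBBER
#define TF_FP_IN_CLOBBERS 0
#else
/* No frame pointer in use: tell the compiler r11 may be overwritten. */
#define TF_FP_CLOBBER , "r11"
#define TF_FP_IN_CLOBBERS 1
#endif

#if defined(__KERNEL__) && defined(__arm__)
static void tf_generic_smc_fp_sketch(unsigned int type, unsigned int arg1,
				     unsigned int arg2)
{
	/* Pin the arguments to the registers the SMC expects them in. */
	register unsigned int r0 asm("r0") = type;
	register unsigned int r1 asm("r1") = arg1;
	register unsigned int r2 asm("r2") = arg2;

	asm volatile(
		".arch_extension	sec\n\t"
		"mov	r3, #0\n\t"
		"mov	r4, #0\n\t"
		"smc	#0\n\t"
		:
		: "r" (r0), "r" (r1), "r" (r2)
		: "memory", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10"
		  TF_FP_CLOBBER);
}
#else
/*
 * Off-target fallback (an assumption of this sketch): record the arguments
 * instead of issuing an SMC, so the file also builds outside an ARM kernel.
 */
struct tf_call_record {
	unsigned int type, arg1, arg2;
};
static struct tf_call_record tf_last_call;

static void tf_generic_smc_fp_sketch(unsigned int type, unsigned int arg1,
				     unsigned int arg2)
{
	tf_last_call.type = type;
	tf_last_call.arg1 = arg1;
	tf_last_call.arg2 = arg2;
}
#endif
```

The macro expands inside the clobber list before the compiler parses the asm statement, so the list stays a single comma-separated sequence in both configurations.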

I have no idea about TF ABI either. Looking at the downstream kernel code, r4 -
r12 should be saved. I've CC'd Alexandre as he is the author of the original
patch and may still remember the details.

I'm also wondering why original code doesn't have r3 in the clobber list and why
r3 is set to '0', downstream sets it to the address of SP and on return from SMC
r3 contains the address of SP which should be restored. I'm now wondering how
SMC calling worked for me at all on T30, maybe it didn't..

I don't know what the ABI for ATF is. I assume it's documented in the ATF, PSCI, or similar specification, or ATF source code. Hence, I don't know whether ATF restores fp/r11.

Oops, I think we're starting to diverge here - "ATF" (as in "Arm
Trusted Firmware") does implement the ARM SMCCC, which more or less
just follows the regular procedure call standard in terms of register
saving. The "TF" in question here is "Trusted Foundations" from
Trusted Logic (who apparently don't exist any more) which is
explicitly called out in the header as having its own nonstandard
calling convention. I guess newer Tegras are using the former, whereas
the older ones used the latter.


What do you mean by "called out in the header as having its own
nonstandard"?

Specifically, the comment in arch/arm/include/asm/trusted_foundations.h which says:

"The calls are completely specific to Trusted Foundations, and do
*not* follow the SMC calling convention or the PSCI standard."

It is unclear what ABI is used; I just inferred from the fact that
registers were saved before that it might use a nonstandard calling
convention.

Tegra 4i/TK1 and newer seem to use something called Trusted Little
Kernel.

My guess is that r3/r4 are set to 0 because they're defined as inputs by the SMC/ATF ABI, yet nothing the kernel does needed that many parameters, so they're hard-coded to 0 (to ensure they're set to something predictable) rather than also being parameters to tf_generic_smc().

The original code used to save/restore a lot of registers, including r11/fp. Can't we side-step the issue of including/not-including r11/fp in the clobber list by not removing those stmfd/ldmfd assembly instructions?

That might be reasonable - fiddling with a C function's stack inside
an asm is a bit grim, but for this case I can't see that it would mess
with unwinding etc. or otherwise go wrong any more than the existing
code, and I doubt the slight efficiency hit from having to change the
"pop the LR straight into the PC" idiom matters much.

Sounds reasonable, I guess in that case we can also omit all the
additional registers in the clobber list.

Yeah, you should only need to specify clobbers for any registers which are neither used as arguments nor explicitly preserved - looking at the layout of the code, it seems unlikely that the compiler would have anything live in r3 or r12 across the call (since the scope for inlining is pretty trivial), but there's no harm in being strictly correct :)
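
For illustration, that approach might look roughly like the sketch below: a regular (non-naked) function that keeps the explicit stmfd/ldmfd pair, restores lr instead of popping straight into pc, and lists only r3 and r12 as clobbers. The name tf_generic_smc_stack_sketch() is made up, the __asmeq() checks are omitted to keep it self-contained, and this is not necessarily what the final patch should look like. It also does not address whether the firmware modifies r0-r2.

```c
/*
 * Sketch only: tf_generic_smc_stack_sketch() is a hypothetical name,
 * not part of the patch under review.
 */
#if defined(__KERNEL__) && defined(__arm__)
static void tf_generic_smc_stack_sketch(unsigned int type, unsigned int arg1,
					unsigned int arg2)
{
	/* Pin the arguments to the registers the SMC expects them in. */
	register unsigned int r0 asm("r0") = type;
	register unsigned int r1 asm("r1") = arg1;
	register unsigned int r2 asm("r2") = arg2;

	asm volatile(
		".arch_extension	sec\n\t"
		/* Preserve r4-r11 and lr by hand, as the naked version did. */
		"stmfd	sp!, {r4 - r11, lr}\n\t"
		"mov	r3, #0\n\t"
		"mov	r4, #0\n\t"
		"smc	#0\n\t"
		/*
		 * Restore lr rather than popping into pc: the compiler's own
		 * epilogue performs the return in a regular function.
		 */
		"ldmfd	sp!, {r4 - r11, lr}\n\t"
		:
		: "r" (r0), "r" (r1), "r" (r2)
		/* Only registers neither passed as inputs nor saved above. */
		: "memory", "r3", "r12");
}
#else
/*
 * Off-target fallback (an assumption of this sketch): record the arguments
 * instead of issuing an SMC, so the file also builds outside an ARM kernel.
 */
struct tf_stack_call_record {
	unsigned int type, arg1, arg2;
};
static struct tf_stack_call_record tf_stack_last_call;

static void tf_generic_smc_stack_sketch(unsigned int type, unsigned int arg1,
					unsigned int arg2)
{
	tf_stack_last_call.type = type;
	tf_stack_last_call.arg1 = arg1;
	tf_stack_last_call.arg2 = arg2;
}
#endif
```

Since r11 is saved and restored inside the asm rather than named as a clobber, this form avoids the "fp cannot be used in asm here" error regardless of CONFIG_FRAME_POINTER.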

Robin.