Re: [PATCH 0/8] unwind, arm64: add sframe unwinder for kernel

From: Indu Bhagat
Date: Sun Mar 09 2025 - 10:44:34 EST


On 2/27/25 10:47 PM, Indu Bhagat wrote:
On 2/27/25 1:38 AM, Puranjay Mohan wrote:
Indu Bhagat <indu.bhagat@xxxxxxxxxx> writes:

On 2/26/25 2:23 AM, Puranjay Mohan wrote:
Indu Bhagat <indu.bhagat@xxxxxxxxxx> writes:

On 2/25/25 3:54 PM, Weinan Liu wrote:
On Tue, Feb 25, 2025 at 11:38 AM Indu Bhagat <indu.bhagat@xxxxxxxxxx> wrote:

On Mon, Feb 10, 2025 at 12:30 AM Weinan Liu <wnliu@xxxxxxxxxx> wrote:
I already have a WIP patch to add sframe support to the kernel module.
However, it is not yet working. I had trouble unwinding frames for the
kernel module using the current algorithm.

Indu has likely identified the issue and will be addressing it from the
toolchain side.

https://sourceware.org/bugzilla/show_bug.cgi?id=32666

I have a work-in-progress patch that adds sframe support for kernel
modules.
https://github.com/heuza/linux/tree/sframe_unwinder.rfc

According to the sframe table values I got during runtime testing, it looks
like the offsets are not correct.


I hope to sanitize the fix for 32666 and post upstream soon (I had to
address other related issues).  Unless fixed, relocating .sframe
sections using the .rela.sframe is expected to generate incorrect output.

When unwinding the symbol init_module (0xffff80007b155048) from the kernel
module (livepatch-sample.ko), the start_address of the FDE entries in the
sframe table of the kernel module appears incorrect.

init_module will apply the relocations to the .sframe section, won't it?

For instance, the first FDE's start_addr is reported as -20564. Adding
this offset to the module's sframe section address (0xffff80007b15a040)
yields 0xffff80007b154fec, which is not within the livepatch-sample.ko
memory region (it should be larger than 0xffff80007b155000).


Hmm... something seems off here.  Having tested a potential fix for 32666
locally, I do not expect the first FDE to show this symptom.



Hi,

Sorry for not responding in the past few days.  I was on PTO and was
trying to improve my snowboarding technique. I am back now!!

I think what we are seeing is expected behaviour:

   | For instance, the first FDE's start_addr is reported as -20564. Adding
   | this offset to the module's sframe section address (0xffff80007b15a040)
   | yields 0xffff80007b154fec, which is not within the livepatch-sample.ko
   | memory region (it should be larger than 0xffff80007b155000).
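
The arithmetic in the quoted report can be checked with a few lines of C.
This is only a sketch: the helper names fde_func_addr and below_module_base
are hypothetical (they are not kernel functions), and the addresses are the
ones quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: resolve an FDE start address by adding the signed
 * 32-bit offset stored in the FDE to the address of the .sframe section.
 */
static uint64_t fde_func_addr(uint64_t sframe_sec_addr, int32_t func_start_off)
{
	return sframe_sec_addr + (uint64_t)(int64_t)func_start_off;
}

/* Hypothetical check: does addr fall below the module's lowest address? */
static bool below_module_base(uint64_t addr, uint64_t mod_base)
{
	return addr < mod_base;
}
```

With sframe_sec_addr = 0xffff80007b15a040 and func_start_off = -20564, the
helper returns 0xffff80007b154fec, which lies below 0xffff80007b155000.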


Let me explain using a __dummy__ example.

Assume Memory layout before relocation:

   | Address | Element                                 | Relocation
   |  ....   | ....                                    |
   |   60    | init_module (start address)             |
   |   72    | init_module (end address)               |
   |  ....   | .....                                   |
   |   100   | Sframe section header start address     |
   |   128   | First FDE's start address               | RELOC_OP_PREL -> Put init_module address (60) - current address (128)

So, after relocation, the first FDE's start address has the value 60 - 128 = -68


For an SFrame FDE, the function start address is:

"Signed 32-bit integral field denoting the virtual memory address of the
described function, for which the SFrame FDE applies.  The value encoded
in the ‘sfde_func_start_address’ field is the offset in bytes of the
function’s start address, from the SFrame section."

So, in your case, after applying the relocations, you will get:
S + A - P = 60 - 128 = -68

This is the distance of the function start address (60) from the current
location in SFrame section (128)

But what we intend to store is the distance of the function start
address from the start of the SFrame section.  So we need to do an
additional step for SFrame FDE:  Value += r_offset
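
The extra step can be sketched with the numbers from the dummy layout above.
This is an illustration only, not the kernel's or the linker's code: the
function names are hypothetical, and r_offset is assumed to be the distance
of the relocated field from the start of the SFrame section (128 - 100 = 28).

```c
#include <stdint.h>

/* Plain PC-relative value, as computed for a PREL32 relocation: S + A - P. */
static int32_t prel32_value(uint64_t sym, int64_t addend, uint64_t place)
{
	return (int32_t)((int64_t)(sym - place) + addend);
}

/*
 * Hypothetical fix-up: add r_offset (distance of the relocated field from
 * the start of .sframe) so the result is relative to the section start
 * rather than to the field being relocated.
 */
static int32_t sframe_func_start(uint64_t sym, int64_t addend,
				 uint64_t place, uint64_t sec_start)
{
	int64_t r_offset = (int64_t)(place - sec_start);

	return (int32_t)(prel32_value(sym, addend, place) + r_offset);
}
```

For the dummy layout, prel32_value(60, 0, 128) gives -68, while
sframe_func_start(60, 0, 128, 100) gives -40, which is 60 - 100, the
distance of init_module from the start of the SFrame section.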

Thanks for the explanation, now it makes sense.

But I couldn't find a relocation type in AARCH64 that does this extra +=
r_offset along with PREL32.

The kernel's module loader is only doing the R_AARCH64_PREL32 which is
why we see this issue.

How is this working even for the kernel itself? Or, for that matter, any
other binary compiled with sframe?


For the usual executables or shared objects, the calculations are applied by ld.bfd at this time.  Hence, the issue manifests in relocatable files.

From my limited understanding, the way to fix this would be to hack the
relocator to do this additional step while relocating .sframe sections.
Or the 'addend' values in .rela.sframe should already have the +r_offset
added to them, then no change to the relocator would be needed.
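
The second option can be sketched the same way, reusing the numbers from the
dummy layout (S = 60, P = 128, r_offset = 28). This is an assumption-laden
illustration, not what GAS or ld.bfd actually emit: both function names are
hypothetical.

```c
#include <stdint.h>

/* Hypothetical toolchain-side step: fold r_offset into the addend. */
static int64_t sframe_adjusted_addend(int64_t addend, uint64_t r_offset)
{
	return addend + (int64_t)r_offset;
}

/* Unmodified relocator: plain S + A - P, with no .sframe special case. */
static int32_t apply_prel32(uint64_t sym, int64_t addend, uint64_t place)
{
	return (int32_t)((int64_t)(sym - place) + addend);
}
```

With the adjusted addend, apply_prel32(60, sframe_adjusted_addend(0, 28), 128)
yields -40, the same section-relative value, without touching the relocator.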


Of the two, adjusting the addend values in .rela.sframe may be a reasonable way to go about it.  Let me try it out in GAS and ld.bfd.


A fix for this is in the works (being discussed on the binutils@sourceware list). I will keep you posted.

Thanks
Indu