Re: Copying TLS/user register data per perf-sample?

From: Beau Belgrave
Date: Thu Apr 11 2024 - 13:27:32 EST


On Fri, Apr 12, 2024 at 12:55:19AM +0900, Masami Hiramatsu wrote:
> On Wed, 10 Apr 2024 08:35:42 -0700
> Beau Belgrave <beaub@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Wed, Apr 10, 2024 at 10:06:28PM +0900, Masami Hiramatsu wrote:
> > > On Thu, 4 Apr 2024 12:26:41 -0700
> > > Beau Belgrave <beaub@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > > Hello,
> > > >
> > > > I'm looking into the possibility of capturing user data that is pointed
> > > > to by a user register (IE: fs/gs for TLS on x86/64) for each sample via
> > > > perf_events.
> > > >
> > > > I was hoping to find a way to do this similar to PERF_SAMPLE_STACK_USER.
> > > > I think it could even use roughly the same ABI in the perf ring buffer.
> > > > Or it may be possible by some kprobe linked to the perf sample function.
> > > >
> > > > This would allow a profiler to collect TLS (or other values) on x64. In
> > > > the Open Telemetry profiling SIG [1], we are trying to find a fast way
> > > > to grab a tracing association quickly on a per-thread basis. The team
> > > > at Elastic has a bespoke way to do this [2], however, I'd like to see a
> > > > more general way to achieve this. The folks I've been talking with seem
> > > > open to the idea of just having a TLS value for this we could capture
> > > > upon each sample. We could then just state, Open Telemetry SDKs should
> > > > have a TLS value for span correlation. However, we need a way to sample
> > > > the TLS value(s) when a sampling event is generated.
> > > >
> > > > Is this already possible via some other means? It'd be great to be able
> > > > to do this directly at the perf_event sample via the ABI or a probe.
> > > >
> > >
> > > Have you tried to use uprobes? It should be able to access user-space
> > > registers including fs/gs.
> > >
> >
> > We need to get fs/gs during a sample interrupt from perf. If the sample
> > interrupt lands during kernel code (IE: syscall) we would also like to
> > get these TLS values when in process context.
>
> OK, those are not directly accessible from pt_regs.
>

Yeah, it's a per-arch thread attribute.

> >
> > I have some patches into the kernel to make this possible via
> > perf_events that works well, however, I don't want to reinvent the wheel
> > if there is some way to get these via perf samples already.
>
> I would like to see it. I think it is possible to introduce a helper
> to get a base address of user TLS for probe events, and start supporting
> from x86.
>

For sure, I'm hoping the patches start the right conversations.

> >
> > In OTel, we are trying to attribute samples to transactions that are
> > occurring. So the TLS fetch has to be aligned exactly with the sample.
> > You can do this via eBPF when it's available, however, we have
> > environments where eBPF is not available.
> >
> > It's sounding like to do this properly without eBPF a new feature would
> > be required. If so, I do have some patches I can share in a bit as an
> > RFC.
>
> It is better to be shared in RFC stage, so that we can discuss it from
> the direction level.
>

Agree, it could be that having the ability to run a probe on sample may
be a better option. Not sure.

Thanks,
-Beau

> Thank you,
>
> >
> > Thanks,
> > -Beau
> >
> > > Thank you,
> > >
> > > --
> > > Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>