Re: [PATCH 0/1] arm64: defconfig: Add Coresight as module

From: James Clark
Date: Thu Sep 22 2022 - 05:34:56 EST




On 21/09/2022 16:08, Catalin Marinas wrote:
> On Wed, Sep 21, 2022 at 03:05:34PM +0100, James Clark wrote:
>> As suggested by Catalin here's the change to add Coresight to defconfig.
>>
>> Unfortunately I don't think we should add CONFIG_CORESIGHT_SOURCE_ETM4X
>> which builds a few files until [1] is merged because of the overhead
>> of CONFIG_PID_IN_CONTEXTIDR.
>>
>> [1]: https://lore.kernel.org/lkml/20211021134530.206216-1-leo.yan@xxxxxxxxxx/T/
>
> I thought the overhead wasn't the problem, it's mostly negligible. We
> can probably save a few more cycles on the __switch_to() path by
> replacing several isb()s in those functions with a single one just
> before cpu_switch_to().
>
> IIRC the issue is that unless a process runs in the root pid namespace,
> the actual pid written to contextidr is meaningless.

This is true, and Leo has recently disabled it in that scenario in
aab473867fed.

>
> Now that you reminded me of that thread, I see three options (sorry, not
> entirely related to the defconfig updates):
>
> 1. Remove CONFIG_PID_IN_CONTEXTIDR and corresponding code completely,
> find other events to correlate the task with the trace.

Unfortunately, when tracing per core we would need kernel timestamps in
the trace to correlate it with the context switch records. At the moment
Coresight uses a different clock source, so that's not possible and we
still use the context ID to correlate samples.

With FEAT_TRF in v8.4 it will be possible to do this and we've started
working towards that here: 0f00b223ea22. But we'd still have to support
older hardware too, so CONFIG_PID_IN_CONTEXTIDR can't be removed completely.

For SPE it's not required because we already have the right timestamps
in the samples and we've added support for no context IDs in the decoder
here: 27d113cfe892.

>
> 2. Always on CONFIG_PID_IN_CONTEXTIDR (we might as well remove the
> Kconfig entry). This would write the root pid namespace value
> (task_pid_nr()).

If we're not worried about the overhead after all, this would be the
easiest solution. And then SPE or Coresight already decide whether they
want to use the value or not, so no further changes are needed.

From Leo's patch there is a table that shows a 1% overhead with it
enabled permanently, and I've heard a figure like that mentioned before.
So I could also resurrect that patch to use static keys? Although it's a
bit more complicated, that would be my preference. And then we can have
that mode always on.
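
To make the static key idea a bit more concrete, here is a rough userspace
sketch, not the actual patch. A plain bool stands in for the static key
(in the kernel it would be something like DEFINE_STATIC_KEY_FALSE() tested
with static_branch_unlikely(), so the disabled case is a patched branch
rather than a load), and a variable stands in for the context ID register.
All the names below are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a kernel static key, off by default. */
static bool pid_in_contextidr_enabled;

/* Hypothetical stand-in for the context ID system register. */
static uint32_t fake_contextidr;

/* Counts register writes so the cost of each mode is visible. */
static unsigned long contextidr_writes;

/* A tracer (e.g. a Coresight or SPE session) would flip the key on start. */
void tracer_get(void)
{
	pid_in_contextidr_enabled = true;
}

/* ...and flip it back when the last session ends. */
void tracer_put(void)
{
	pid_in_contextidr_enabled = false;
}

/* Called on every context switch; only pays for the write when enabled. */
void context_switch(uint32_t next_pid)
{
	if (!pid_in_contextidr_enabled)
		return;

	fake_contextidr = next_pid;
	contextidr_writes++;
}
```

With something like this the config could be always on, and the switch
path would only do the write while a trace session holds the key. A real
version would need reference counting for concurrent sessions, which this
sketch leaves out.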

>
> 3. Similar to (2) but instead write task_pid_nr_ns(). An alternative
> here is to write -1 if the task is not in the root pid namespace.
>
> Strong preference for (1).
>
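
For reference, the difference between (2), (3) and the -1 alternative
comes down to which value gets written on a switch. A minimal userspace
sketch, assuming a made-up task structure and policy enum (in the kernel
the values would come from task_pid_nr() and task_pid_nr_ns()):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a task for illustration only. */
struct task_sketch {
	int32_t root_pid;   /* pid as seen from the root pid namespace */
	int32_t ns_pid;     /* pid as seen from the task's own namespace */
	bool in_root_ns;    /* true if the task lives in the root namespace */
};

/* Hypothetical names for the choices discussed in the thread. */
enum contextidr_policy {
	POLICY_ROOT_PID,        /* option 2: always the root namespace value */
	POLICY_NS_PID,          /* option 3: the namespace-local value */
	POLICY_ROOT_OR_INVALID, /* option 3 alternative: -1 outside root ns */
};

/* Returns the value that would be written to the context ID register. */
uint32_t contextidr_value(const struct task_sketch *t,
			  enum contextidr_policy policy)
{
	switch (policy) {
	case POLICY_NS_PID:
		return (uint32_t)t->ns_pid;
	case POLICY_ROOT_OR_INVALID:
		return t->in_root_ns ? (uint32_t)t->root_pid : UINT32_MAX;
	case POLICY_ROOT_PID:
	default:
		return (uint32_t)t->root_pid;
	}
}
```

Here -1 is represented as UINT32_MAX, on the assumption that the register
value is treated as an unsigned 32-bit quantity.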