Re: [PATCH RFC 0/2] Dynamically allocate memory to store task's full name

From: Bhupesh Sharma
Date: Tue Mar 18 2025 - 07:21:31 EST


Hi,

Thanks for the review and inputs on the additional possible use-cases.
Please see my replies inline.

On 3/15/25 1:13 PM, Andres Rodriguez wrote:


On 3/14/25 14:25, Kees Cook wrote:
On Fri, Mar 14, 2025 at 10:57:13AM +0530, Bhupesh wrote:
While working with user-space debugging tools, especially on Linux
gaming platforms, I found that the task name is truncated due to the
limitation of TASK_COMM_LEN.

For example, when running 'ps' today, the task->comm value of a long
task name is truncated due to the limitation of TASK_COMM_LEN:
     create_very_lon

This leads to the names passed from userland via pthread_setname_np()
being truncated.

So there have been long discussions about "comm", and it mainly boils
down to "leave it alone". For the /proc-scraping tools like "ps" and
"top", they check both "comm" and "cmdline", depending on mode. The more
useful (and already untruncated) stuff is in "cmdline", so I suspect it
may make more sense to have pthread_setname_np() interact with that
instead. Also TASK_COMM_LEN is basically considered userspace ABI at
this point and we can't sanely change its length without breaking the
world.


Completely agree that comm is best left untouched. TASK_COMM_LEN is embedded into the kernel and the pthread ABI, so changes here should be avoided.


So, basically my approach _does not_ touch TASK_COMM_LEN at all. The normal 16-byte 'TASK_COMM_LEN' design remains untouched.
This means that all legacy / existing ABI users of 'task->comm', which are designed / written to handle the 16-byte 'TASK_COMM_LEN' name, continue to work as before using '/proc/$pid/task/$tid/comm'.

This change-set only adds a _parallel_ dynamically allocated 'task->full_name' which can be used by interested users via '/proc/$pid/task/$tid/full_name'.
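
For reference, the 16-byte limit that legacy readers keep seeing can be reproduced from userspace. This is only a minimal sketch, assuming Linux with glibc; rename_and_read_comm() and COMM_BUF_LEN are names made up for this example and are not part of the patch:

```c
#include <stddef.h>
#include <string.h>
#include <sys/prctl.h>

/* Assumed to mirror the kernel's TASK_COMM_LEN (16 bytes, NUL included). */
#define COMM_BUF_LEN 16

/*
 * Hypothetical helper: rename the calling thread, then copy the name the
 * kernel actually kept into 'out'. Returns 0 on success, -1 on failure.
 */
int rename_and_read_comm(const char *name, char *out, size_t outlen)
{
    char comm[COMM_BUF_LEN] = { 0 };

    if (outlen < COMM_BUF_LEN)
        return -1;
    /* PR_SET_NAME silently truncates anything longer than 15 characters. */
    if (prctl(PR_SET_NAME, name) != 0)
        return -1;
    /* PR_GET_NAME needs a buffer of at least 16 bytes. */
    if (prctl(PR_GET_NAME, comm) != 0)
        return -1;
    memcpy(out, comm, COMM_BUF_LEN);
    return 0;
}
```

Passing a name such as "create_very_long_task_name" gives back "create_very_lon", which is the truncation shown in the cover letter. Nothing in this path changes with the patch-set.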

[PATCH 2/2] shows only a possible use-case of the same and can be dropped with only [PATCH 1/2] being considered to add the '/proc/$pid/task/$tid/full_name' interface.

Best to use /proc/$pid/task/$tid/cmdline IMO...

Your recommendation works great for programs like ps and top, which are
the examples proposed in the cover letter. However, I think the opening email didn't point out use cases where the name is modified at runtime. In those cases cmdline would be an unsuitable solution, as it should remain immutable across the process lifetime. An example of this use case would be to set a thread's name for debugging purposes and then try to query it via gdb or perf.

I wrote a quick and dirty example to illustrate what I mean:
https://github.com/lostgoat/tasknames

I think an alternative approach could be to have a separate entry in procfs to store a task's debug name (and leave comm completely untouched), e.g. /proc/$pid/task/$tid/debug_name. This would allow userspace apps to be updated with the following logic:

get_task_debug_name() {
    if ( !is_empty( debug_name ) )
        return debug_name;
    return comm;
}

"Legacy" userspace apps would remain ABI compatible as they would just fall back to comm. And apps that want to opt in to the new behaviour can be updated one at a time. Which would be work intensive, but even just updating gdb and perf would be super helpful.

I am fine with adding either '/proc/$pid/task/$tid/full_name' or '/proc/$pid/task/$tid/debug_name' (both achieve the same thing).
The new / modified users (especially the debug applications you listed above) can switch easily to using '/proc/$pid/task/$tid/full_name' instead of '/proc/$pid/task/$tid/comm'.
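
To make the opt-in path concrete, here is a rough userspace sketch of the fallback logic. It assumes the 'full_name' file is present only on kernels carrying [PATCH 1/2]; read_first_line() and get_task_name() are hypothetical helper names for illustration, not code from gdb, perf or the patch:

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of 'path' into 'buf' and strip the trailing newline.
 * Returns 0 on success, -1 if the file is missing, unreadable or empty.
 */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len < 2)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return buf[0] ? 0 : -1;
}

/*
 * Prefer the proposed 'full_name' entry and fall back to 'comm' when it
 * is absent or empty. 'task_dir' is e.g. "/proc/<pid>/task/<tid>".
 */
int get_task_name(const char *task_dir, char *buf, size_t len)
{
    char path[4096];

    snprintf(path, sizeof(path), "%s/full_name", task_dir);
    if (read_first_line(path, buf, len) == 0)
        return 0;

    snprintf(path, sizeof(path), "%s/comm", task_dir);
    return read_first_line(path, buf, len);
}
```

On a kernel without the patch the first open simply fails and the caller gets the usual 'comm' value, so an updated tool keeps working on older kernels.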

AFAIK we already achieved this for kthreads with commit d6986ce24fc00 ("kthread: dynamically allocate memory to store kthread's full name"), which adds 'full_name' in parallel to 'comm' for kthread names.

Thanks,
Bhupesh