Re: [PATCH] proc: Avoid a thundering herd of threads freeing proc dentries
From: Eric W. Biederman
Date: Mon Jun 22 2020 - 11:17:40 EST
Masahiro Yamada <masahiroy@xxxxxxxxxx> writes:
> On Fri, Jun 19, 2020 at 11:14 PM Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
>>
>>
>> Junxiao Bi <junxiao.bi@xxxxxxxxxx> reported:
>> > When debugging a performance issue, I found that thousands of threads
>> > exiting at around the same time could cause severe spin lock contention
>> > on the proc dentry "/proc/$parent_process_pid/task/", because threads
>> > need to clean up their pid files from that dir when they exit.
>>
>> Matthew Wilcox <willy@xxxxxxxxxxxxx> reported:
>> > We've looked at a few different ways of fixing this problem.
>>
>> The flushing of the proc dentries from the dcache is an optimization,
>> and is not necessary for correctness. Eventually cache pressure will
>> cause the dentries to be freed even if no flushing happens. Some
>> light testing when I refactored the proc flushing [1] indicated that at
>> least the memory footprint is easily measurable.
>>
>> An optimization that causes a performance problem due to a thundering
>> herd of threads is no real optimization.
>>
>> Modify the code to only flush the /proc/<tgid>/ directory when all
>> threads in a process are killed at once. This continues to flush
>> practically everything when the process is reaped as the threads live
>> under /proc/<tgid>/task/<tid>.
>>
>> There is a rare possibility that a debugger will access /proc/<tid>/,
>> which this change will no longer flush, but I believe such accesses
>> are sufficiently rare to not be observed in practice.
>>
>> [1] 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
>> Link: https://lkml.kernel.org/r/54091fc0-ca46-2186-97a8-d1f3c4f3877b@xxxxxxxxxx
>
>
>> Reported-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
>
> I did not report this.
Thank you for catching this.
I must have cut&pasted the wrong email address by mistake.
My apologies.
Eric