Re: [PATCH v2 3/3] tty: Use RCU read lock to iterate tasks and threads in __do_SAK()
From: Kirill Tkhai
Date: Thu Jan 18 2018 - 05:11:55 EST
On 17.01.2018 19:54, Eric W. Biederman wrote:
> Kirill Tkhai <ktkhai@xxxxxxxxxxxxx> writes:
>
>> There were made several efforts to make __do_SAK()
>> working in process context long ago, but it does
>> not solves the problem completely. Since __do_SAK()
>> may take tasklist_lock for a long time, the concurent
>> processes, waiting for write lock with interrupts
>> disabled (e.g., forking), get into the same situation
>> like __do_SAK() would have been executed in interrupt
>> context. I've observed several hard lockups on 3.10
>> kernel running 200 containers, caused by long duration
>> of copy_process()->write_lock_irq() after SAK was sent
>> to a tty. Current mainline kernel has the same problem.
>>
>> The solution is to use RCU to iterate processes and threads.
>> Task list integrity is the only reason we taken tasklist_lock
>> before, as tty subsys primitives mostly take it for reading
>> also (e.g., __proc_set_tty). RCU read lock is enough for that.
>> This patch solves the problem and makes __do_SAK() to be
>> not greedy of tasklist_lock. That should prevent hard lockups
>> I've pointed above.
>
> __do_SAK() needs to be 100% accurate. I do not see the rcu_read_lock
> guaranteeing that new processes created while the process list is being
> iterated that happen to have a reference to the tty will be seen.
>
> So I do not believe this is the actual fix to the problem. Especially
> not if we intend to for SAK to remain a secure attention key that
> guarantees no other processes have access to the tty.
As I wrote to your answer to [1/3] SAK does not guarantee that.
See the comment near __do_SAK() header:
"Now, if it would be correct ;-/ The current code has a nasty hole -
it doesn't catch files in flight. We may send the descriptor to ourselves
via AF_UNIX socket, close it and later fetch from socket. FIXME."
My patch does not introduce new races. If there is a race, you may just
prove it by a scheme.
>
>> Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>
>> ---
>> drivers/tty/tty_io.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
>> index 89326cee2403..55115e65668d 100644
>> --- a/drivers/tty/tty_io.c
>> +++ b/drivers/tty/tty_io.c
>> @@ -2724,7 +2724,9 @@ void __do_SAK(struct tty_struct *tty)
>> task_pid_nr(p), p->comm);
>> send_sig(SIGKILL, p, 1);
>> } while_each_pid_task(session, PIDTYPE_SID, p);
>> + read_unlock(&tasklist_lock);
>>
>> + rcu_read_lock();
>> /* Now kill any processes that happen to have the tty open */
>> for_each_process(p) {
>> if (p->signal->tty == tty) {
>> @@ -2754,7 +2756,7 @@ void __do_SAK(struct tty_struct *tty)
>> kill:
>> send_sig(SIGKILL, p, 1);
>> }
>> - read_unlock(&tasklist_lock);
>> + rcu_read_unlock();
>> #endif
>> }
>>