Re: ptrace and pseudoterminals
From: Peter Hurley
Date: Thu Nov 05 2015 - 08:25:29 EST
On 11/04/2015 02:43 PM, Oleg Nesterov wrote:
> On 11/04, Peter Hurley wrote:
>>
>> Hi Pavel,
>>
>> On 11/03/2015 06:16 PM, Pavel Labath wrote:
>>> Hello Oleg, everyone,
>>>
>>> I have noticed something, which may be considered a race in the
>>> interaction of ptrace and pseudoterminal interfaces. Basically, what
>>> happens is this:
>>> - we have two processes: A and B. B has the slave end of the pty open,
>>> A has the master. A is tracing B.
>>> - B writes some data through the slave end and then stops.
>>> - A waits for B to stop.
>>> - A does a select on the master pty endpoint. select returns there is
>>> no data available
>>> - later, A tries the select again, and this time the data appears.
>>
>> This happens because a separate kworker processes the input from slave
>> and wakes the master. At the moment of select() on the master pty, the
>> kworker has not processed the latest input (in fact it may only be
>> scheduled and not running yet).
>>
>> Essentially, you're measuring an asynchronous i/o path with a synchronous
>> method.
>
> Thanks a lot Peter!
>
>>> We are encountering this (very rare) issue in our debugger test suite,
>>> where we check the stdout of the tracee to make sure it is behaving as
>>> expected. I have attached a small program reproducing this behavior
>>> (it fails after about 1000 iterations on a 3.13.0 kernel, I can retry
>>> it on a newer kernel next week if you believe it might work there).
>>> Interestingly, when I replace the pty with a regular pipe, it works as
>>> expected (the data is available as soon as the program stops).
>>>
>>> My question is: Is this behavior something that you would consider a
>>> bug? If yes, do you have any pointers, as to where I should look to
>>> fix it?
>>
>> I don't consider it a bug.
>>
>> That said, I could see a couple of different ways to add this
>> functionality:
>> 1. Implement f_op->fsync() for ttys, which would flush the workqueue
>> (thus waiting for i/o completion). The debugger would fsync() before
>> select() on the master.
>> 2. Automagically for ptraced processes. The basic idea would be that
>> writes to the slave end while a process was being ptraced would
>> set state that would trigger workqueue flush by select/poll/read of
>> the master end.
>
> Oh, I don't think "Automagically if ptrace" makes any sense... What makes
> ptrace special? Afaics nothing.
>
> We can modify this test-case to use signals/futexes/whatever to let the
> parent know that the child has already done write(writefd), and it can
> "fail" the same way.
True.

Also, new patches in mainline head make this _much_ less likely
by scheduling the input processing kworker on the unbound wq (which means
the kworker can start immediately on another cpu rather than being pinned
to the cpu performing the slave write).

After thinking more about this, this use-case seems trivially solvable
by re-select()ing with a timeout before reporting an output-mismatch
failure.
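
A minimal sketch of that retry, for illustration only. The helper names
wait_readable() and output_matches() and the grace_ms parameter are
hypothetical and not taken from the attached test program; a real test
suite would pick its own buffer size and timeout.

```c
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    do {
        /* select() may modify both arguments, so reset them each try. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    } while (rc < 0 && errno == EINTR);

    return rc > 0 ? 1 : rc;
}

/* Hypothetical helper: keep select()ing (with a timeout) until `want`
 * bytes have arrived, and only then compare against `expected`.
 * Each wait gets the full grace_ms, so the total wait can exceed it.
 * Returns 1 on match, 0 on mismatch, timeout or EOF, -1 on error. */
static int output_matches(int fd, const char *expected, size_t want,
                          int grace_ms)
{
    char buf[256];
    size_t got = 0;

    if (want > sizeof(buf))
        return -1;

    while (got < want) {
        int rc = wait_readable(fd, grace_ms);
        ssize_t n;

        if (rc <= 0)
            return rc;      /* 0: nothing arrived in time, -1: error */

        n = read(fd, buf + got, want - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return 0;       /* EOF before the expected output arrived */
        got += (size_t)n;
    }

    return memcmp(buf, expected, want) == 0;
}
```

The tracer would call output_matches(master_fd, expected, len, grace_ms)
after the tracee stops, instead of treating a single empty select() as a
failure. This only makes the test tolerant of the kworker delay; it does
not make the pty path synchronous.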
Regards,
Peter Hurley