Re: n_tty: Check the other end of pty pair before returning EAGAIN on a read()
From: Peter Hurley
Date: Fri Dec 11 2015 - 08:56:20 EST
On 12/11/2015 05:37 AM, Marc Aurele La France wrote:
> On Thu, 10 Dec 2015, Peter Hurley wrote:
>> On 12/10/2015 02:48 PM, Marc Aurele La France wrote:
>>> On Thu, 10 Dec 2015, Peter Hurley wrote:
>>>> On 12/09/2015 01:06 PM, Marc Aurele La France wrote:
>
>>>>> After sshd has been SIGCHLD'ed about the shell's termination, it
>>>>> continues to read the master pty until an error occurs. This error
>>>>> will be EIO if no process has the slave pty open. Otherwise (for
>>>>> example when the shell spawned long-running processes in the
>>>>> background before terminating), that error is expected to be EAGAIN.
>>>>> sshd cannot continue to read until an EIO in all cases, because doing
>>>>> so causes the session to hang until all processes have closed the
>>>>> slave pty, which is not the desired behaviour. Thus a spurious EAGAIN
>>>>> return causes sshd to lose data, whether or not the slave pty is
>>>>> completely closed.
>
>>>> Ah, the games userspace will be up to :)
>
>>> Not really.
>
>> Definitely.
>
>> The idea that a read with O_NONBLOCK set should have synchronous behavior
>> is ridiculous.
>
>>> The fact different OSes behave differently in this regard can
>>> hardly be said to be userland's fault. The lower the number of distinct
>>> behaviours userland needs to deal with, the better. Furthermore, sshd
>>> "knows" there should be data there, so it makes no sense to befuddle it
>>> with false EAGAIN returns.
>
>> But sshd doesn't "know". sshd "knows" the data has been sent and that's all.
>> sshd is extrapolating from one known condition to another unknown condition,
>> and assuming it "should" be that way because it has been.
>
>> For example, try the same idea with real ttys on loopback. Wouldn't work,
>> because it's asynchronous.
>
>> The only reason this needs fixing is because it's a userspace regression.
Misunderstanding.
"userspace regression" = kernel regression observable by userspace
> It's the kernel that introduced this regression, not OpenSSH.
>
> I am not asking to read data before it has been produced. I am puzzled that despite knowing that the data exists, I can now be lied to when I try to retrieve it, when I wasn't before. We are talking about what is essentially a two-way pipe, not some network or serial connection with transmission delays userland has long experience in dealing with.
>
> These previously internal additional delays, that are now exposed to userland, are simply an implementation detail that userland did not, and should not, need to worry about.
Your mental model is that pseudo-terminals are a synchronous pipe, which
is not true.
But this argument is pointless because the regression needs to be fixed
regardless of the merits.
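
To illustrate the asynchronous model, here is a minimal sketch that is not
from this thread and is not how sshd is implemented. It assumes Linux pty
semantics, where reading the master fails with EIO once the slave side is
fully closed and drained. The function name drain_master and the
idle_timeout parameter are hypothetical. The reader waits in poll() for
readability instead of treating a single EAGAIN as "no more data":

```python
import errno
import os
import select


def drain_master(master_fd, idle_timeout=0.5):
    """Read a non-blocking pty master until EIO/EOF or until it stays idle.

    idle_timeout (seconds) is an arbitrary illustrative value, not a
    recommendation; it only bounds the wait when the slave is still open.
    """
    os.set_blocking(master_fd, False)
    poller = select.poll()
    poller.register(master_fd, select.POLLIN)
    chunks = []
    while True:
        # Wait until the kernel reports data or a hangup on the master.
        if not poller.poll(int(idle_timeout * 1000)):
            break  # nothing became readable within the timeout
        try:
            data = os.read(master_fd, 4096)
        except BlockingIOError:
            continue  # EAGAIN: not readable yet, go back to poll()
        except OSError as exc:
            if exc.errno == errno.EIO:
                break  # Linux: no process has the slave open any more
            raise
        if not data:
            break  # EOF, as reported on some other systems
        chunks.append(data)
    return b"".join(chunks)
```

With the slave closed, the loop ends on EIO (or EOF); with the slave still
held open by a background process, it ends once the pty has been idle for
idle_timeout.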
Regards,
Peter Hurley
>> This is just one of those unfortunate situations where userspace has come
>> to rely on an unspecified behavior because it worked.
>
> Whether the behaviour is specified or not is irrelevant. This simply means there is no standard to debunk the fact that the kernel's previous behaviour mimics that of other systems.
>
> So, how am I supposed to avoid these spurious EAGAINs and finally be allowed to read the data I know exists? How long do I have to wait? Do I have to run a calibration loop to figure that out? Why should I need to do that only on Linux?
>
> I don't know, but there's nonsense in here somewhere.
>
> Marc.