RE: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
From: Nic Percival
Date: Tue May 05 2015 - 08:04:29 EST
There is only ever one debuggee process.
My original demo (and indeed the original test failure) is not threaded. The debugger is multi-threaded.
I've brought Chris, Fletch and Paul, my immediate colleagues, into the discussion.
The email thread is getting a little tangled; however, from my standpoint I have:
1) poll tells us we have nothing to read on a pty, when we know something was written into the other end.
2) Given that 'poll' is not telling us that data has been written into the pty, what can we use? Surely that is what poll is for.
3) If a debuggee program has displayed 'how old are you?' and then hit a breakpoint on the 'ACCEPT' response, then the question might very well not be displayed, despite the debugger sitting on the statement some way subsequent to the display.
4) If I understand correctly, the modification is a performance enhancement. Obviously in the case of 'ptrace' debugging, performance is not a requirement.
5) Given that 'xterm' uses ptys, could a scenario happen where a user is prompted 'How old are you?' in the xterm, but an input (getchar, whatever) is hit before that output is displayed? With or without ptrace?
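Points 1 and 2 can be illustrated with a minimal Python sketch. This is not code from the debugger: the helper name pty_has_input and the 1000 ms timeout are assumptions made for illustration. It writes a prompt into the slave end of a pty, then polls the master end twice: once with a zero timeout, which may report nothing on an affected kernel, and once with a bounded wait.

```python
import os
import select


def pty_has_input(master_fd, timeout_ms):
    """Return True if poll() reports readable data on master_fd within timeout_ms."""
    poller = select.poll()
    poller.register(master_fd, select.POLLIN)
    return any(event & select.POLLIN for _, event in poller.poll(timeout_ms))


master_fd, slave_fd = os.openpty()
os.write(slave_fd, b"how old are you?")  # the "debuggee" side writes its prompt

# Zero timeout asks "is it there right now?"; this may be False even though
# the write above has already returned.
immediate = pty_has_input(master_fd, 0)

# A bounded wait (assumed value: 1000 ms) lets the kernel finish moving the
# data from the slave end to the master end.
eventual = pty_has_input(master_fd, 1000)
data = os.read(master_fd, 1024) if eventual else b""
```

The value of `immediate` is deliberately left unasserted, since whether it is True or False is exactly the behaviour under discussion.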
Thanks,
Nic
-----Original Message-----
From: Peter Hurley [mailto:peter@xxxxxxxxxxxxxxxxxx]
Sent: 05 May 2015 12:19
To: Nic Percival; Michael Matz
Cc: NeilBrown; Greg Kroah-Hartman; Jiri Slaby; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
A: No.
Q: Should I include quotations after my reply?
http://daringfireball.net/2007/07/on_top
On 05/05/2015 04:20 AM, Nic Percival wrote:
> Michael is correct.
> Our COBOL debugger has a test feature whereby we can drive it to step through debugging code, hitting breakpoints and so on.
> The debugger maintains a 'user screen' which is what the 'debuggee' process has displayed.
> This is communicated to the debugger with pseudo-ttys.
> The state of this user screen is checked as part of this (and other) tests.
So the debugger doesn't display output from other non-TRACEME threads or child processes of the debuggee, right?
When that's fixed, you'll see that the "test failure" has gone away.
> The actual test failure is a failure of some text to be displayed on the debuggee user screen when we know, given it has hit a certain breakpoint, that the text has been written.
>
> What is worse is that it's non-deterministic.
That your test is non-deterministic stems from the fact that the i/o is asynchronous.
You would experience the same problem if your test setup were a tty in loopback.
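One way a test could tolerate this asynchrony, sketched in Python under stated assumptions: the helper wait_for_text and its 2-second default deadline are hypothetical and are not part of any debugger discussed here. Rather than checking the user screen at the instant the breakpoint is hit, the reader keeps collecting output until the expected text arrives or the deadline passes.

```python
import os
import select
import time


def wait_for_text(fd, expected, timeout_s=2.0):
    """Collect output from fd until `expected` appears or the deadline passes.

    Returns everything read so far; the caller checks whether `expected`
    is in the result.
    """
    deadline = time.monotonic() + timeout_s
    seen = b""
    while expected not in seen:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # deadline passed without seeing the text
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            continue  # timed out; the loop re-checks the deadline
        try:
            chunk = os.read(fd, 4096)
        except OSError:
            break  # e.g. EIO once the other end of the pty is closed
        if not chunk:
            break  # end of file
        seen += chunk
    return seen
```

Used against the master end of a pty, `wait_for_text(master_fd, b"how old are you?")` returns as soon as the prompt has been delivered, so the test only pays the full timeout when the text never shows up.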
> Sometimes the text makes it and is displayed, so it wouldn't even be practical to modify the test to make it pass.
> We wouldn't really want to do that anyway - the test is just fine on other earlier SUSE, on RedHat (intel and 390), HP/UX, AIX and Solaris.
There is a reason Linux is the platform of choice for scalability.
Regards,
Peter Hurley
> -----Original Message-----
> From: Michael Matz [mailto:matz@xxxxxxx]
> Sent: 04 May 2015 13:24
> To: Peter Hurley
> Cc: NeilBrown; Nic Percival; Greg Kroah-Hartman; Jiri Slaby;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH bisected regression] input_available_p() sometimes says 'no' when it should say 'yes'
>
> Hi,
>
> On Fri, 1 May 2015, Peter Hurley wrote:
>
>> I don't think this is a real bug, in the sense that pty i/o is not
>> synchronous, in the same way that tty i/o is not synchronous.
>
> Here's what I wrote internally about my speculations about this being a bug or not:
>
>>> I also never hit it with pipes (remove the USEPTY define), also not
>>> on sle12, so it must be some change specific to the pty implementation.
>>>
>>> Now, all of this is of course unspecified. There are two
>>> asynchronous processes involved, and a buffered tube between them.
>>> Just because one process filled one end of the tube (the breakpoint
>>> was hit) doesn't mean the contents have to appear at that instant at
>>> the other end. So the change in behaviour in sle12 is not a genuine
>>> bug. It _might_ be an unintended change, though, that's why kernel
>>> people should comment on this. If there are no terribly good
>>> reasons for this change I'd consider it a quality-of-implementation
>>> regression in sle12.
>
> So, I'd accept this being declared a non-bug, but it is certainly a change in behaviour that's visible for our debugger team.
>
>> However, that said, if this is a regression (regression as in "it
>> broke something that used to work", not regression as in "this new
>> thing I'm writing doesn't behave the way I want it to" :) )
>>
>> Help me understand the use-case here: are you using pty i/o to debug
>> the debugger?
>
> Nic is working on the Cobol debugger, but I think this pty i/o is rather a part of the normal interaction between a debugged Cobol process and the debugger; that's just a theory, and Nic is authoritative here. But this change in behaviour _did_ result in real testsuite regressions, so it's not something that he wanted to write from scratch.
>
> (FWIW: I do think it's a better QoI factor if something returns data
> from a tube if we can know via side channels (break points) that
> something must have been written locally to the other end of the tube,
> if that can be ensured without too much other work)
>
>
> Ciao,
> Michael.