Re: v4.4-rc1: /dev/console open fails with -EIO
From: bhe@xxxxxxxxxx
Date: Wed Dec 16 2015 - 09:24:01 EST
Hi Junichi,
Peter Hurley posted a patch for this problem a little earlier:
https://lkml.org/lkml/2015/11/27/546
It may have been found first on arm by Pratyush Anand <panand@xxxxxxxxxx>.
I also hit it this week on Fedora 23.
Anyway, it's great that the problem was fixed so quickly. I'm just
replying to let you know.
Thanks
Baoquan
On 12/16/15 at 06:32am, Junichi Nomura wrote:
> Since kernel v4.4-rc1, kdump capture service with Fedora23 / RHEL7.2
> almost always fails on my test system which uses serial console. It
> used to work fine until kernel v4.3.
>
> Kdump fails with an error like this:
> kdump.sh[1040]: /bin/kdump.sh: line 8: /dev/console: Input/output error
>
> Line 8 of kdump.sh does this:
>   exec &> /dev/console
> (http://pkgs.fedoraproject.org/cgit/kexec-tools.git/tree/dracut-kdump.sh)
>
> and the EIO is returned by this code in tty_reopen():
>   if (!tty->count)
>           return -EIO;
>
> Bisection shows that commit 79c1faa4511e ("tty: Remove
> tty_wait_until_sent_from_close()") is the first bad commit.
> Indeed, after reverting the commit, kdump capture starts working
> again.
>
> Open of /dev/console used to return -EIO when it races with close.
> (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/554172/comments/245)
> But the commit seems to widen the race window.
>
> Before the commit:
>   tty_release()
>     tty_lock(tty)
>     tty->ops->close(tty, filp)
>     tty_unlock(tty)
>     tty_wait_until_sent()
>     // the window starts from here
>     tty_lock(tty)
>     decrement tty->count
>     tty_unlock(tty)
>     (releasing tty if count became zero)
>
> After the commit:
>   tty_release()
>     // the window starts from here
>     tty_lock(tty)
>     tty->ops->close(tty, filp)
>     tty_wait_until_sent()
>     decrement tty->count
>     tty_unlock(tty)
>     (releasing tty if count became zero)
>
> While user space might be able to cope with the problem by retrying
> open(), it has no way to know whether to retry or for how long.
> Also, the current situation makes shell scripts like the kdump.sh
> above fragile against this sort of timing change.
>
> How about retrying tty_open in the kernel instead, like the attached patch?
> If !tty->count in tty_reopen() means the race has happened, that
> seems reasonable.
>
> ---
> Jun'ichi Nomura, NEC Corporation
>
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index bcc8e1e..070ea66 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -1462,8 +1462,9 @@ static int tty_reopen(struct tty_struct *tty)
>  {
>  	struct tty_driver *driver = tty->driver;
>
> +	/* We cannot re-open tty which is being released. */
>  	if (!tty->count)
> -		return -EIO;
> +		return -ERESTARTSYS;
>
>  	if (driver->type == TTY_DRIVER_TYPE_PTY &&
>  	    driver->subtype == PTY_TYPE_MASTER)
> @@ -2087,6 +2088,11 @@ retry_open:
>
>  	if (IS_ERR(tty)) {
>  		retval = PTR_ERR(tty);
> +		if (retval == -ERESTARTSYS && !signal_pending(current)) {
> +			tty_free_file(filp);
> +			schedule();
> +			goto retry_open;
> +		}
>  		goto err_file;
>  	}
>
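
For illustration only, the user-space workaround Junichi mentions
(retrying open() when it fails with EIO) might look roughly like the C
sketch below. The helper name open_retry_on_eio and its attempt and
delay parameters are hypothetical and are not taken from kdump or
kexec-tools. As noted above, there is no principled way to choose the
retry bound, so the values are the caller's guess.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not an existing API):
 * open 'path', retrying only when open() fails with EIO, which is the error
 * tty_reopen() returns when the open races with the final close.
 *
 * Returns a file descriptor on success, or -1 with errno set on failure.
 */
static int open_retry_on_eio(const char *path, int flags,
                             int max_attempts, long delay_ms)
{
    struct timespec delay = {
        .tv_sec = delay_ms / 1000,
        .tv_nsec = (delay_ms % 1000) * 1000000L,
    };
    int attempt;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        int fd = open(path, flags);

        if (fd >= 0)
            return fd;

        /* Any error other than EIO is not the race; report it as-is. */
        if (errno != EIO)
            return -1;

        /* Give the releasing side some time before the next attempt. */
        if (attempt < max_attempts)
            nanosleep(&delay, NULL);
    }

    /* Every attempt hit EIO (or max_attempts was not positive). */
    errno = EIO;
    return -1;
}
```

A caller could use something like open_retry_on_eio("/dev/console",
O_WRONLY, 10, 50) and then dup2() the result onto stdout and stderr.
This only papers over the race from user space; it does not replace a
kernel-side fix.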