Re: [PATCH] Add "-e" option to rpc.gssd to allow error on ticket expiry. Try 2 with added man pages.

From: Trond Myklebust
Date: Fri Nov 18 2011 - 15:34:16 EST


On Fri, 2011-11-18 at 20:19 +0100, John Hughes wrote:
> On 11/18/2011 07:35 PM, Trond Myklebust wrote:
> > On Fri, 2011-11-18 at 15:34 +0100, John Hughes wrote:
> >
> >> Description: Add "-e" (ticket expiry is error) option to rpc.gssd
> >> In kernels starting around 2.6.34 the nfs4 server will block all I/O
> >> when a user ticket expires. In earlier kernels the I/O would fail
> >> with an EACCES error. This patch adds a "-e" option to rpc.gssd
> >> which allows the earlier behaviour (EKEYEXPIRED is converted to
> >> EACCES). This behaviour is particularly useful when user home
> >> directories are nfs4 mounted with krb5 security - if the user is
> >> absent from their workstation for long enough for the ticket to
> >> expire a new ticket will be obtained (via pam_krb5) by the screen
> >> unlock process.
> >>
> > You need a big fat warning somewhere that enabling this option WILL
> > cause data corruption...
> >
> Why?
>
> Because some process may get the EACCES error halfway through its
> operation.

No. Because the process can receive a reply to the write() syscall that
indicates that the data is safe, but the EKEYEXPIRED error will cause
the data to be lost when the client tries to actually commit the data to
disk.
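
To make the failure mode concrete, here is a minimal sketch in Python (an
illustration only, not part of the patch or of rpc.gssd). The helper name
write_and_commit is hypothetical. It shows that a successful write() only
means the kernel accepted the data; an error such as the converted EACCES
would surface later, at fsync() or close(), and an application that never
checks those calls would not notice the loss.

```python
import os
import tempfile


def write_and_commit(path: str, payload: bytes) -> int:
    """Write payload to path and force it to stable storage.

    Hypothetical helper for illustration. Returns the number of bytes
    written and raises OSError if the write or the later commit fails.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(payload):
            # Success here only means the kernel accepted the data. On an
            # NFS mount it may still be sitting in the client's cache.
            written += os.write(fd, payload[written:])
        # The commit step: a deferred write-back error is reported here
        # (or at close), not by the earlier write call.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

On a local filesystem this simply succeeds, for example
write_and_commit(os.path.join(tempfile.mkdtemp(), "demo.txt"), b"hello\n").
The point is only where the error check has to live.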

> Ok, that needs documenting.
>
> So far we seem to have established that the old way of doing things was
> bad because it produced non-POSIX behaviour and could lead to data
> corruption if a ticket expires while a process needs it.
>
> And the new way is bad because it leaves people puzzling over hung
> workstations in the morning.
>
> The traditional Kerberos/AFS way was to behave the old way, and use
> krenew to keep the ticket from expiring if a process needed to be run
> overnight.

Which is just wrong: the general intention of Kerberos security is to
ensure that the _user_ has ACKed an operation. Renewing tickets without
user input would circumvent that intention. If you need to have the job
run overnight, then ask for a longer lifetime for your ticket.

> What other way is there of fixing the problem if we are going to keep
> the "hang 'til a ticket turns up" behaviour? (Rewriting GNOME and KDE
> seems kind of a big job.)

Notify the kernel that a ticket is about to expire so that the kernel
can decide to block the process on the next NFS-related syscall.
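
As a rough sketch of the check such a notification could be based on (Python,
purely illustrative): the function name, the 300-second margin and the use of
epoch timestamps are all assumptions, not anything that exists in rpc.gssd or
the kernel today.

```python
import time
from typing import Optional

# Hypothetical safety margin in seconds; not a value taken from rpc.gssd
# or the kernel.
EXPIRY_MARGIN_SECONDS = 300.0


def ticket_about_to_expire(
    expiry_epoch: float,
    now: Optional[float] = None,
    margin: float = EXPIRY_MARGIN_SECONDS,
) -> bool:
    """Return True if the ticket expires within `margin` seconds.

    An already expired ticket also counts as "about to expire".
    """
    if now is None:
        now = time.time()
    return expiry_epoch - now <= margin
```

A daemon could run a check like this periodically and warn the kernel once it
returns True, so that new NFS syscalls block before any write is accepted
that could not later be committed.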

Trond

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
