Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

From: Linus Torvalds
Date: Tue Aug 01 2017 - 13:20:38 EST

On Tue, Aug 1, 2017 at 8:51 AM, davej@xxxxxxxxxxxxxxxxx
<davej@xxxxxxxxxxxxxxxxx> wrote:
> On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
> > Any chance of getting the output from
> >
> > ./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
> Hm, that points to this..
> 7463 /* Save the EXCHANGE_ID verifier session trunk tests */
> 7464 memcpy(clp->, cdata->args.verifier->data,
> 7465 sizeof(clp->;

Ok, that certainly made no sense to me, because the KASAN report made
it look like a stale pathname access (allocated in getname, freed in
putname), but I think the issue is more fundamental than that.

That cdata->args.verifier seems to be entirely broken. AT least for
the "xprt == NULL" case, it does the following:

- use the address of a local variable ("&verifier")

- wait for the rpc completion using rpc_wait_for_completion_task().

That's unacceptably buggy crap. rpc_wait_for_completion_task() will
happily exit on a deadly signal even if the rpc hasn't been completed,
so now you'll have a stale pointer to a stack that has been freed.

So I think the 'pathname' part may actually be entirely a red herring,
and it's the underlying access itself that just picks up a random
pointer from a stack that now contains something different. And KASAN
didn't notice the stale stack access itself, because the stack slot is
still valid - it's just no longer the original 'verifier' allocation.

Or *something* like that.

None of this looks even remotely new, though - the code seems to go
back to 2009. Have you just changed what you're testing to trigger
these things?

I'm not even sure why it does that stupid stack allocation. It does a
*real* allocation just a few lines later:

struct nfs41_exchange_id_data *calldata
calldata = kzalloc(sizeof(*calldata), GFP_NOFS);

and the whole verifier structure could easily have been part of that
same allocation as far as I can tell.

And that really might seem to be the right thing to do.


That patch compiles for me. It *might* even work. Or it might just be
the ramblings of a diseased mind.

Anna? Trond?

So caveat probatorem,

fs/nfs/nfs4proc.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 18ca6879d8de..0712af3d38f8 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7490,6 +7490,11 @@ static const struct rpc_call_ops nfs4_exchange_id_call_ops = {
.rpc_release = nfs4_exchange_id_release,

+struct verifier_and_calldata {
+ struct nfs41_exchange_id_data calldata;
+ nfs4_verifier verifier;
* _nfs4_proc_exchange_id()
@@ -7498,7 +7503,8 @@ static const struct rpc_call_ops nfs4_exchange_id_call_ops = {
static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred,
u32 sp4_how, struct rpc_xprt *xprt)
- nfs4_verifier verifier;
+ struct verifier_and_calldata *vna;
+ nfs4_verifier *verifier;
struct rpc_message msg = {
.rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_EXCHANGE_ID],
.rpc_cred = cred,
@@ -7516,14 +7522,17 @@ static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred,
if (!atomic_inc_not_zero(&clp->cl_count))
return -EIO;

- calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
- if (!calldata) {
+ vna = kzalloc(sizeof(*vna), GFP_NOFS);
+ if (!vna) {
return -ENOMEM;
+ /* kfree() of calldata will also free the verifier */
+ calldata = &vna->calldata;
+ verifier = &vna->verifier;

if (!xprt)
- nfs4_init_boot_verifier(clp, &verifier);
+ nfs4_init_boot_verifier(clp, verifier);

status = nfs4_init_uniform_client_string(clp);
if (status)
@@ -7566,7 +7575,7 @@ static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred *cred,
calldata->args.verifier = &clp->cl_confirm;
} else {
- calldata->args.verifier = &verifier;
+ calldata->args.verifier = verifier;
calldata->args.client = clp;