Re: [PATCH] nfsd: fix memory corruption caused by readdir

From: J. Bruce Fields
Date: Tue Mar 05 2019 - 16:42:15 EST


On Tue, Mar 05, 2019 at 10:48:45AM +1100, NeilBrown wrote:
> On Mon, Mar 04 2019, J. Bruce Fields wrote:
>
> > On Mon, Mar 04, 2019 at 02:08:22PM +1100, NeilBrown wrote:
> >> (Note that the commit hash in the Fixes tag is from the 'history'
> >> tree - this bug predates git).
> >> Fixes: eb229d253e6c ("[PATCH] kNFSd: fix two xdr-encode bugs for readdirplus reply")
> >
> > It'd be nice to provide a URL for that. The one I originally cloned
> > seems to have disappeared.
>
> Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=eb229d253e6c
>
> Though on reflection, that didn't introduce the bug, it just failed to
> fix it properly. It should be:
>
> Fixes: 0b1d57cf7654 ("[PATCH] kNFSd: Fix nfs3 dentry encoding")
> Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=0b1d57cf7654

Oh, so we can blame Olaf. Even better.

> > And how did it go undetected so long, and what caused it to surface just
> > now?
>
> I suspect two different things need to come together to trigger the bug.
> 1/ a directory needs to have filename lengths which cause the xdr
> encoding of the readdirplus reply to place the offset across a page
> boundary.
> A typical entry is around 200 bytes, or 50 quads, so there should be
> a 1:50 chance of hitting that, assuming name lengths are evenly
> distributed (which they aren't).
> In the case which triggered the bug, all file names were 43 bytes,
> all filehandles 28 bytes. This means 192 bytes per entry.
> 21 entries fit in a page leaving 64 bytes. This puts the cookie
> on the page boundary.
>
> 2/ The *next* entry after the one that crosses the page boundary doesn't
> fit. In the cases which triggered, the requested size was 0x1110
> (4368).
> That is enough room for 21 entries, but not for 22.
>
> So presumably the client doesn't run Linux - which always asks
> for 4096 bytes of directory entry (from a Linux server).
> I have no idea what clients the customer was using, but these clients
> seem to have a fairly good chance of triggering the bug (when configured
> like the customer configured them - maybe).

Thanks for the explanation!
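
For anyone checking the numbers above, here is a rough sketch of the arithmetic in C. It assumes the entryplus3 wire layout from RFC 1813 (a 4-byte "value follows" marker, 8-byte fileid, length-prefixed name padded to 4 bytes, 8-byte cookie, then optional 84-byte fattr3 and optional filehandle), and it assumes the entries start at offset 0 of a 4096-byte page. The helper names are made up for illustration and are not nfsd functions.

```c
#include <stdio.h>

#define PAGE_SIZE   4096u  /* assumed page size */
#define COOKIE_LEN  8u     /* cookie3 is a 64-bit value */
#define FATTR3_LEN  84u    /* fixed size of fattr3 on the wire */

/* XDR pads opaque data and strings to a multiple of 4 bytes. */
static unsigned xdr_pad(unsigned len)
{
	return (len + 3u) & ~3u;
}

/* Byte offset of the cookie from the start of one encoded entry:
 * "value follows" (4) + fileid (8) + name length (4) + padded name. */
static unsigned cookie_offset(unsigned namelen)
{
	return 4u + 8u + 4u + xdr_pad(namelen);
}

/* Encoded size of one entryplus3, with attributes and filehandle present:
 * cookie (8) + attr flag (4) + fattr3 (84) + fh flag (4) + fh length (4)
 * + padded filehandle. */
static unsigned entryplus_size(unsigned namelen, unsigned fhlen)
{
	return cookie_offset(namelen) + COOKIE_LEN + 4u + FATTR3_LEN +
	       4u + 4u + xdr_pad(fhlen);
}

/* Returns 1 if the cookie of entry number 'index' (counting from 0)
 * would span a page boundary, assuming equal-sized entries packed
 * back to back from offset 0. */
static int cookie_straddles_page(unsigned index, unsigned namelen,
				 unsigned fhlen)
{
	unsigned start = index * entryplus_size(namelen, fhlen) +
			 cookie_offset(namelen);
	unsigned end = start + COOKIE_LEN - 1u;

	return start / PAGE_SIZE != end / PAGE_SIZE;
}
```

With 43-byte names and 28-byte filehandles, entryplus_size() gives 192, so 21 entries fill 4032 bytes of the page and leave 64. Under these assumptions the cookie of the following entry would occupy bytes 4092 through 4099, which straddles the boundary at 4096.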

> > I once thought about converting this over to the xdr_stream api that
> > NFSv4 uses to hide the page-crossing logic now. But I think it's better
> > to leave it alone.
>
> I agree - the code isn't being actively developed, so stability wins
> over elegance.
>
>
> BTW, the readdir (non-plus) code doesn't really need fixing.
> nfs3svc_decode_readdirargs() caps the ->count at PAGE_SIZE, so the cookie
> can never cross pages. nfs3svc_decode_readdirplusargs() caps it
> at max_blocksize. So if you feel like leaving that part of the change
> out, I probably wouldn't complain.

Eh, makes sense to me to fix it.

--b.