Re: [GIT PULL] eCryptfs fixes for 3.2-rc3

From: Linus Torvalds
Date: Wed Nov 23 2011 - 17:56:46 EST


On Wed, Nov 23, 2011 at 2:25 PM, Tyler Hicks <tyhicks@xxxxxxxxxxxxx> wrote:
>
> Tyler Hicks (3):
>      eCryptfs: Flush file in vma close

I'm not hugely happy with this one.

The commit message says:

Dirty pages weren't being written back when an mmap'ed eCryptfs file was
closed before the mapping was unmapped. Since f_ops->flush() is not
called by the munmap() path, the lower file was simply being released.
This patch flushes the eCryptfs file in the vm_ops->close() path.

Fair enough - you've debugged the problem. You're misusing the ->flush
thing which only gets called at close time, rather than flushing
things at ->release time. But why?

"->flush()" is very special, and is literally meant for things that
need to wait at close time. A file descriptor may be flushed many
times for a single open (because it was dup'ed etc), and yes, if it is
closed before mmap, it will be flushed before the mmap is done. The
"flush()" is basically attached to a particular fd - useful mainly for
things like special devices that actually want to delay the close
(serial lines etc).

But the fundamental issue is that I don't think cryptfs should be
using "flush". The file is still *open* when "flush()" is called.
That's the fundamental reason for the bug, I think.

cryptfs should flush the encrypted information at *release* time, not
"flush" time. And that would have avoided the bug with mmap, because
release gets called on the very last internal reference count drop of
the 'struct file' - so it gets called after the last close *and*
munmap.

Is there some reason I am missing that cryptfs has to use flush?

I'm doing the pull, but I really think that this is papering over the
*real* bug, which was the use of 'flush' in the first place.

In general, I'd urge people to *not* use "->flush" at all as a
"correctness issue". It's useful to return EIO to "close()" and to be
*polite* (ie the return value of "flush()" will be returned to user
space at close time), but it really should be seen as a "we try to
flush now to try to give user space nice error reports where
possible", but it's important to understand that it's not the last
close, and if you rely on it for correctness, you're doing something
wrong. It's "release()" that is the "get rid of all your state now",
and is about correctness. "flush" is purely about being polite.

ecryptfs seems to have relied on it for correctness. Not good.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/