Re: KCSAN: data-race in __alloc_file / __alloc_file

From: Linus Torvalds
Date: Mon Nov 11 2019 - 18:51:26 EST


On Mon, Nov 11, 2019 at 11:00 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > if (ppos) {
> > pos = *ppos; // data-race
>
> That code uses "fdget_pos().
>
> Which does mutual exclusion _if_ the file is something we care about
> pos for, and if it has more than one process using it.

That said, the more I look at that code, the less I like it.

I have this feeling we really should get rid of FMODE_ATOMIC_POS
entirely, now that we have the much nicer FMODE_STREAM to indicate
that 'pos' really doesn't matter.

Also, the test for "file_count(file) > 1" really is wrong, in that it
means that we protect against other processes, but not other threads.

So maybe we really should do the attached thing. Adding Al and Kirill
to the cc for comments. Kirill did some fairly in-depth review of the
whole locking on f_pos, it might be good to get his comments.

Al? Note the change from

- if (file_count(file) > 1) {
+ if ((v & FDPUT_FPUT) || file_count(file) > 1) {

in __fdget_pos(). It basically says that the threaded case also does
the pos locking.

NOTE! This is entirely untested. It might be totally broken. It passes
my "LooksSuperficiallyFine(tm)" test, but that's all I'm going to say
about the patch.

Linus
fs/file.c | 4 ++--
fs/open.c | 6 +-----
include/linux/fs.h | 2 --
3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/fs/file.c b/fs/file.c
index 3da91a112bab..708e5c2b7d65 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -795,8 +795,8 @@ unsigned long __fdget_pos(unsigned int fd)
unsigned long v = __fdget(fd);
struct file *file = (struct file *)(v & ~3);

- if (file && (file->f_mode & FMODE_ATOMIC_POS)) {
- if (file_count(file) > 1) {
+ if (file && !(file->f_mode & FMODE_STREAM)) {
+ if ((v & FDPUT_FPUT) || file_count(file) > 1) {
v |= FDPUT_POS_UNLOCK;
mutex_lock(&file->f_pos_lock);
}
diff --git a/fs/open.c b/fs/open.c
index b62f5c0923a8..5c68282ea79e 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -771,10 +771,6 @@ static int do_dentry_open(struct file *f,
f->f_mode |= FMODE_WRITER;
}

- /* POSIX.1-2008/SUSv4 Section XSI 2.9.7 */
- if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode))
- f->f_mode |= FMODE_ATOMIC_POS;
-
f->f_op = fops_get(inode->i_fop);
if (WARN_ON(!f->f_op)) {
error = -ENODEV;
@@ -1256,7 +1252,7 @@ EXPORT_SYMBOL(nonseekable_open);
*/
int stream_open(struct inode *inode, struct file *filp)
{
- filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE | FMODE_ATOMIC_POS);
+ filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE);
filp->f_mode |= FMODE_STREAM;
return 0;
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e0d909d35763..a7c3f6dd5701 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -148,8 +148,6 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
/* File is opened with O_PATH; almost nothing can be done with it */
#define FMODE_PATH ((__force fmode_t)0x4000)

-/* File needs atomic accesses to f_pos */
-#define FMODE_ATOMIC_POS ((__force fmode_t)0x8000)
/* Write access to underlying fs */
#define FMODE_WRITER ((__force fmode_t)0x10000)
/* Has read method(s) */