Re: KCSAN: data-race in fat16_ent_put / fat_search_long

From: Matthew Wilcox
Date: Tue Nov 05 2019 - 10:25:35 EST


On Tue, Nov 05, 2019 at 03:39:23PM +0100, Marco Elver wrote:
> On Tue, 05 Nov 2019, syzbot wrote:
> > ==================================================================
> > BUG: KCSAN: data-race in fat16_ent_put / fat_search_long
> >
> > write to 0xffff8880a209c96a of 2 bytes by task 11985 on cpu 0:
> > fat16_ent_put+0x5b/0x90 fs/fat/fatent.c:181
> > fat_ent_write+0x6d/0xf0 fs/fat/fatent.c:415
> > fat_chain_add+0x34e/0x400 fs/fat/misc.c:130
> > fat_add_cluster+0x92/0xd0 fs/fat/inode.c:112
> > __fat_get_block fs/fat/inode.c:154 [inline]
> > fat_get_block+0x3ae/0x4e0 fs/fat/inode.c:189
> > __block_write_begin_int+0x2ea/0xf20 fs/buffer.c:1968
> > __block_write_begin fs/buffer.c:2018 [inline]
> > block_write_begin+0x77/0x160 fs/buffer.c:2077
> > cont_write_begin+0x3d6/0x670 fs/buffer.c:2426
> > fat_write_begin+0x72/0xc0 fs/fat/inode.c:235
> > pagecache_write_begin+0x6b/0x90 mm/filemap.c:3148
> > cont_expand_zero fs/buffer.c:2353 [inline]
> > cont_write_begin+0x17a/0x670 fs/buffer.c:2416
> > fat_write_begin+0x72/0xc0 fs/fat/inode.c:235
> > pagecache_write_begin+0x6b/0x90 mm/filemap.c:3148
> > generic_cont_expand_simple+0xb0/0x120 fs/buffer.c:2317
> >
> > read to 0xffff8880a209c96b of 1 bytes by task 11990 on cpu 1:
> > fat_search_long+0x20a/0xc60 fs/fat/dir.c:484
> > vfat_find+0xc1/0xd0 fs/fat/namei_vfat.c:698
> > vfat_lookup+0x75/0x350 fs/fat/namei_vfat.c:712
> > lookup_open fs/namei.c:3203 [inline]
> > do_last fs/namei.c:3314 [inline]
> > path_openat+0x15b6/0x36e0 fs/namei.c:3525
> > do_filp_open+0x11e/0x1b0 fs/namei.c:3555
> > do_sys_open+0x3b3/0x4f0 fs/open.c:1097
> > __do_sys_open fs/open.c:1115 [inline]
> > __se_sys_open fs/open.c:1110 [inline]
> > __x64_sys_open+0x55/0x70 fs/open.c:1110
> > do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 1 PID: 11990 Comm: syz-executor.2 Not tainted 5.4.0-rc3+ #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > ==================================================================
>
> I was trying to understand what is happening here, but I fail to see
> how this can happen. So it'd be good if somebody who knows this code
> can explain. We are quite positive this is not a false positive, given
> that the accessed addresses overlap.

Both of these accesses go through a buffer head; i.e., the data being
accessed is stored in the page cache. Is it possible that the page was
reused for different data between these two accesses?
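
As a rough sanity check on the addresses in the report, here is a small
userspace sketch (not kernel code; the helper names are made up for
illustration). It assumes the buffer data is at least 32-byte aligned,
and relies on a FAT directory entry being 32 bytes with the attr byte at
offset 11, right after the 11-byte name. Under those assumptions the
1-byte read lands exactly on an attr field, while the 2-byte write is a
naturally aligned FAT16 entry covering the same byte, which would be
consistent with one block being treated as a directory by one task and
as part of the FAT by the other.

```c
#include <assert.h>
#include <stdint.h>

/* On-disk size of a FAT directory entry (struct msdos_dir_entry). */
#define DIR_ENTRY_SIZE 32u
/* attr follows the 11-byte name[] field, so it sits at offset 11. */
#define ATTR_OFFSET 11u

/* Return 1 if [a, a+alen) and [b, b+blen) share at least one byte. */
int ranges_overlap(uint64_t a, unsigned alen, uint64_t b, unsigned blen)
{
	return a < b + blen && b < a + alen;
}

/*
 * Offset of addr within its 32-byte directory slot.
 * Assumption: the buffer data itself is at least 32-byte aligned.
 */
unsigned slot_offset(uint64_t addr)
{
	return (unsigned)(addr % DIR_ENTRY_SIZE);
}

/* Check the two addresses taken from the KCSAN report. */
void check_kcsan_report(void)
{
	const uint64_t write_addr = 0xffff8880a209c96aULL; /* fat16_ent_put, 2 bytes */
	const uint64_t read_addr  = 0xffff8880a209c96bULL; /* fat_search_long, 1 byte */

	/* The read hits the second byte of the 16-bit FAT entry being written. */
	assert(ranges_overlap(write_addr, 2, read_addr, 1));
	/* The read is at the attr offset of a directory slot. */
	assert(slot_offset(read_addr) == ATTR_OFFSET);
	/* The write is 2-byte aligned, as a FAT16 entry would be. */
	assert(write_addr % 2 == 0);
}
```

Calling check_kcsan_report() from a main() should pass all three
assertions. That does not say how the two views came to share a block,
only that the addresses fit both layouts.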

> The two bits of code in question here are:
>
> static void fat16_ent_put(struct fat_entry *fatent, int new)
> {
> 	if (new == FAT_ENT_EOF)
> 		new = EOF_FAT16;
>
> 	*fatent->u.ent16_p = cpu_to_le16(new);		<<== data race here
> 	mark_buffer_dirty_inode(fatent->bhs[0], fatent->fat_inode);
> }
>
> int fat_search_long(struct inode *inode, const unsigned char *name,
> 		    int name_len, struct fat_slot_info *sinfo)
> {
> 	struct super_block *sb = inode->i_sb;
> 	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> 	struct buffer_head *bh = NULL;
> 	struct msdos_dir_entry *de;
> 	unsigned char nr_slots;
> 	wchar_t *unicode = NULL;
> 	unsigned char bufname[FAT_MAX_SHORT_SIZE];
> 	loff_t cpos = 0;
> 	int err, len;
>
> 	err = -ENOENT;
> 	while (1) {
> 		if (fat_get_entry(inode, &cpos, &bh, &de) == -1)
> 			goto end_of_dir;
> parse_record:
> 		nr_slots = 0;
> 		if (de->name[0] == DELETED_FLAG)
> 			continue;
> 		if (de->attr != ATTR_EXT && (de->attr & ATTR_VOLUME))	<<== data race here
> 			continue;
> 		if (de->attr != ATTR_EXT && IS_FREE(de->name))
> 			continue;
> 	<snip>
> }
>
> Thanks,
> -- Marco