Re: KCSAN: data-race in fat16_ent_put / fat_search_long

From: OGAWA Hirofumi
Date: Wed Nov 06 2019 - 03:39:44 EST


Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:

> On Tue, Nov 05, 2019 at 03:39:23PM +0100, Marco Elver wrote:
>> On Tue, 05 Nov 2019, syzbot wrote:
>> > ==================================================================
>> > BUG: KCSAN: data-race in fat16_ent_put / fat_search_long
>> >
>> > write to 0xffff8880a209c96a of 2 bytes by task 11985 on cpu 0:
>> > fat16_ent_put+0x5b/0x90 fs/fat/fatent.c:181
>> > fat_ent_write+0x6d/0xf0 fs/fat/fatent.c:415
>> > fat_chain_add+0x34e/0x400 fs/fat/misc.c:130
>> > fat_add_cluster+0x92/0xd0 fs/fat/inode.c:112
>> > __fat_get_block fs/fat/inode.c:154 [inline]
>> > fat_get_block+0x3ae/0x4e0 fs/fat/inode.c:189
>> > __block_write_begin_int+0x2ea/0xf20 fs/buffer.c:1968
>> > __block_write_begin fs/buffer.c:2018 [inline]
>> > block_write_begin+0x77/0x160 fs/buffer.c:2077
>> > cont_write_begin+0x3d6/0x670 fs/buffer.c:2426
>> > fat_write_begin+0x72/0xc0 fs/fat/inode.c:235
>> > pagecache_write_begin+0x6b/0x90 mm/filemap.c:3148
>> > cont_expand_zero fs/buffer.c:2353 [inline]
>> > cont_write_begin+0x17a/0x670 fs/buffer.c:2416
>> > fat_write_begin+0x72/0xc0 fs/fat/inode.c:235
>> > pagecache_write_begin+0x6b/0x90 mm/filemap.c:3148
>> > generic_cont_expand_simple+0xb0/0x120 fs/buffer.c:2317
>> >
>> > read to 0xffff8880a209c96b of 1 bytes by task 11990 on cpu 1:
>> > fat_search_long+0x20a/0xc60 fs/fat/dir.c:484
>> > vfat_find+0xc1/0xd0 fs/fat/namei_vfat.c:698
>> > vfat_lookup+0x75/0x350 fs/fat/namei_vfat.c:712
>> > lookup_open fs/namei.c:3203 [inline]
>> > do_last fs/namei.c:3314 [inline]
>> > path_openat+0x15b6/0x36e0 fs/namei.c:3525
>> > do_filp_open+0x11e/0x1b0 fs/namei.c:3555
>> > do_sys_open+0x3b3/0x4f0 fs/open.c:1097
>> > __do_sys_open fs/open.c:1115 [inline]
>> > __se_sys_open fs/open.c:1110 [inline]
>> > __x64_sys_open+0x55/0x70 fs/open.c:1110
>> > do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290
>> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> >
>> > Reported by Kernel Concurrency Sanitizer on:
>> > CPU: 1 PID: 11990 Comm: syz-executor.2 Not tainted 5.4.0-rc3+ #0
>> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> > Google 01/01/2011
>> > ==================================================================
>>
>> I was trying to understand what is happening here, but fail to see how
>> this can happen. So it'd be good if somebody who knows this code can
>> explain. We are quite positive this is not a false positive, given the
>> addresses accessed match.
>
> Both of these accesses are into a buffer head; ie the data being accessed
> is stored in the page cache. Is it possible the page was reused for
> different data between these two accesses?

No and yes. The reader side is a directory buffer and the writer side is
a FAT buffer, so a FAT buffer is never reused as a directory buffer. But
the page cache page itself can be freed and reused for a different index.
So if KCSAN can't detect page cache recycling, this report would be
possible.

Is there any way to know why KCSAN thought this was a data race?

>> The two bits of code in question here are:
>>
>> static void fat16_ent_put(struct fat_entry *fatent, int new)
>> {
>> 	if (new == FAT_ENT_EOF)
>> 		new = EOF_FAT16;
>>
>> 	*fatent->u.ent16_p = cpu_to_le16(new); <<== data race here
>> 	mark_buffer_dirty_inode(fatent->bhs[0], fatent->fat_inode);
>> }

This updates a FAT entry (the index for data cluster placement) in the
FAT buffer.

>> int fat_search_long(struct inode *inode, const unsigned char *name,
>> 		    int name_len, struct fat_slot_info *sinfo)
>> {
>> 	struct super_block *sb = inode->i_sb;
>> 	struct msdos_sb_info *sbi = MSDOS_SB(sb);
>> 	struct buffer_head *bh = NULL;
>> 	struct msdos_dir_entry *de;
>> 	unsigned char nr_slots;
>> 	wchar_t *unicode = NULL;
>> 	unsigned char bufname[FAT_MAX_SHORT_SIZE];
>> 	loff_t cpos = 0;
>> 	int err, len;
>>
>> 	err = -ENOENT;
>> 	while (1) {
>> 		if (fat_get_entry(inode, &cpos, &bh, &de) == -1)
>> 			goto end_of_dir;
>> parse_record:
>> 		nr_slots = 0;
>> 		if (de->name[0] == DELETED_FLAG)
>> 			continue;
>> 		if (de->attr != ATTR_EXT && (de->attr & ATTR_VOLUME)) <<== data race here

This checks the attribute in the directory buffer.

>> 			continue;
>> 		if (de->attr != ATTR_EXT && IS_FREE(de->name))
>> 			continue;
>> 		<snip>
>> }
>>
>> Thanks,
>> -- Marco

--
OGAWA Hirofumi <hirofumi@xxxxxxxxxxxxxxxxxx>