Re: [PATCH v2] f2fs: avoid readahead race condition

From: Eric Biggers
Date: Mon Jun 29 2020 - 17:30:37 EST


On Mon, Jun 29, 2020 at 11:24:14AM -0700, Jaegeuk Kim wrote:
> On 06/29, Eric Biggers wrote:
> > On Mon, Jun 29, 2020 at 08:03:23AM -0700, Jaegeuk Kim wrote:
> > > If two readahead threads with the same offset enter readpages, every read
> > > IO is split before being issued to the disk, which lowers bandwidth.
> > >
> > > This patch tries to avoid redundant readahead calls.
> > >
> > > Signed-off-by: Jaegeuk Kim <jaegeuk@xxxxxxxxxx>
> > > ---
> > > v2:
> > > - add missing code to bypass read
> > >
> > > fs/f2fs/data.c | 18 +++++++++++++++++-
> > > fs/f2fs/f2fs.h | 1 +
> > > fs/f2fs/super.c | 2 ++
> > > 3 files changed, 20 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> > > index d6094b9f3916..9b69a159cc6c 100644
> > > --- a/fs/f2fs/data.c
> > > +++ b/fs/f2fs/data.c
> > > @@ -2403,6 +2403,7 @@ int f2fs_mpage_readpages(struct address_space *mapping,
> > >  #endif
> > >  	unsigned max_nr_pages = nr_pages;
> > >  	int ret = 0;
> > > +	bool drop_ra = false;
> > >
> > >  	map.m_pblk = 0;
> > >  	map.m_lblk = 0;
> > > @@ -2413,13 +2414,25 @@ int f2fs_mpage_readpages(struct address_space *mapping,
> > >  	map.m_seg_type = NO_CHECK_TYPE;
> > >  	map.m_may_create = false;
> > >
> > > +	/*
> > > +	 * Two readahead threads for same address range can cause race condition
> > > +	 * which fragments sequential read IOs. So let's avoid each other.
> > > +	 */
> > > +	if (pages && is_readahead) {
> > > +		page = list_last_entry(pages, struct page, lru);
> > > +		if (F2FS_I(inode)->ra_offset == page_index(page))
> > > +			drop_ra = true;
> > > +		else
> > > +			F2FS_I(inode)->ra_offset = page_index(page);
> > > +	}
> >
> > This is a data race because ra_offset can be read/written by different threads
> > concurrently.
> >
> > It either needs locking, or READ_ONCE() and WRITE_ONCE() if races are okay.
>
> I just wanted to keep zero overhead, since it doesn't matter in either case
> whether readahead is skipped or not.
>

Okay, then it should use READ_ONCE() and WRITE_ONCE().
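
For illustration only, here is a minimal userspace sketch of that pattern. It
is not the actual f2fs change: the READ_ONCE()/WRITE_ONCE() definitions are
simplified stand-ins for the kernel macros (they rely on the GCC/Clang
__typeof__ extension), and struct ra_state and ra_should_drop() are
hypothetical names standing in for the ra_offset field and the check in the
hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel macros (assumption, not the real ones). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the per-inode ra_offset field. */
struct ra_state {
	unsigned long ra_offset;
};

/*
 * Return true if a readahead ending at @index matches the last recorded
 * offset and can be dropped; otherwise record @index and return false.
 *
 * The check and the update are deliberately not atomic as a pair: a lost
 * update only means one redundant readahead is issued or skipped. The
 * marked accesses make each individual load and store a single access
 * that the compiler may not tear or repeat.
 */
static bool ra_should_drop(struct ra_state *st, unsigned long index)
{
	assert(st);

	if (READ_ONCE(st->ra_offset) == index)
		return true;

	WRITE_ONCE(st->ra_offset, index);
	return false;
}
```

This keeps the fast path free of locks, which is the "zero overhead" property
asked for above, while still documenting that the racy access is intentional.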

- Eric