Re: [RFC] metadata updates vs. fetches (was Re: [PATCH v4] fs: Fix data race in inode_set_ctime_to_ts)
From: Matthew Wilcox
Date: Sun Nov 24 2024 - 18:53:22 EST
On Sun, Nov 24, 2024 at 02:43:58PM -0800, Linus Torvalds wrote:
> On Sun, 24 Nov 2024 at 14:34, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > Could we just do:
> >
> > again:
> > nsec = READ_ONCE(inode->nsec)
> > sec = READ_ONCE(inode->sec)
> > if (READ_ONCE(inode->nsec) != nsec)
> > goto again;
>
> No. You would need to use the right memory ordering barriers.
>
> And make sure the writes are in the right order.
>
> And even then it wouldn't protect against the race in theory, since
> two (separate) time writes could make that nsec check work, even when
> the 'sec' read wouldn't necessarily match *either* of the matching
> nsec cases.
But if we assume that time only goes forwards (ie nobody's calling
utime()), I don't think there's a sequence of updates that lets you see
a file time which is newer than the actual time of the file. I tried
to construct an example, and I couldn't. e.g.:
A: WRITE_ONCE(inode->sec, 5)
A: WRITE_ONCE(inode->nsec, 950)
A: WRITE_ONCE(inode->sec, 6)
B: READ_ONCE(inode->nsec)
B: READ_ONCE(inode->sec)
A: WRITE_ONCE(inode->nsec, 170)
A: WRITE_ONCE(inode->sec, 7)
A: WRITE_ONCE(inode->nsec, 950)
B: READ_ONCE(inode->nsec)
Now we have a time of 6:950 which is never a time that this file had,
but it's intermediate in time between two times that the file _did_
have, so it won't break make.
Or did I not try hard enough to construct a counterexample that
would break make?
(assume the appropriate read/write barriers are in there)
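
For anyone who wants to poke at the trace above, here is a rough
single-threaded replay in userspace C. It is only a sketch: struct ts,
ts_cmp() and replay_trace() are made-up names, not kernel API, the
170 and the final 950 are taken to be nsec writes, and plain
assignments stand in for READ_ONCE/WRITE_ONCE since the interleaving
is fixed by hand and no barriers are needed.

```c
#include <assert.h>

/* Hypothetical stand-in for the two inode timestamp fields. */
struct ts {
	long sec;
	long nsec;
};

/* Order two timestamps: negative, zero or positive, like strcmp(). */
static int ts_cmp(struct ts a, struct ts b)
{
	if (a.sec != b.sec)
		return a.sec < b.sec ? -1 : 1;
	if (a.nsec != b.nsec)
		return a.nsec < b.nsec ? -1 : 1;
	return 0;
}

/*
 * Replay the A/B interleaving step by step and return the time that
 * reader B ends up accepting.
 */
static struct ts replay_trace(void)
{
	struct ts inode = { 0, 0 };
	struct ts seen;
	long nsec_again;

	inode.sec = 5;            /* A */
	inode.nsec = 950;         /* A: file time is 5:950 */
	inode.sec = 6;            /* A: first half of the next update */
	seen.nsec = inode.nsec;   /* B: reads 950 */
	seen.sec = inode.sec;     /* B: reads 6 */
	inode.nsec = 170;         /* A: file time is 6:170 */
	inode.sec = 7;            /* A */
	inode.nsec = 950;         /* A: file time is 7:950 */
	nsec_again = inode.nsec;  /* B: re-check reads 950 */

	/* The nsec re-check passes, so B does not retry. */
	assert(nsec_again == seen.nsec);
	return seen;
}
```

Calling replay_trace() gives 6:950, and ts_cmp() puts that strictly
after 6:170 and strictly before 7:950, which matches the claim that
the torn value sits between two times the file really had. This
replays one interleaving only; it says nothing about the others.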