Re: [Bug 214873] New: man 2 fsync implies possibility to return early

From: Jens Axboe
Date: Sat Oct 30 2021 - 11:17:20 EST


On 10/30/21 6:05 AM, Alejandro Colomar (man-pages) wrote:
> [CC += LKML and a few kernel programmers]
>
> Hi,
>
> On 10/29/21 23:25, bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=214873
>>
>> Bug ID: 214873
>> Summary: man 2 fsync implies possibility to return early
>> Product: Documentation
>> Version: unspecified
>> Hardware: All
>> OS: Linux
>> Status: NEW
>> Severity: low
>> Priority: P1
>> Component: man-pages
>> Assignee: documentation_man-pages@xxxxxxxxxxxxxxxxxxxx
>> Reporter: sworddragon2@xxxxxxxxx
>> Regression: No
>>
>> The manpage for the fsync system call (
>> https://man7.org/linux/man-pages/man2/fsync.2.html ) describes as flushing the
>> related caches to a storage device so that the information can even be
>> retrieved after a crash/reboot. But then it does make the statement "The call
>> blocks until the device reports that the transfer has completed." which causes
>> now some interpretation: What happens if the device reports early completion
>> (e.g. via a bugged firmware) of the transfer while the kernel still sees unsent
>> caches in its context? Does fsync() indeed return then as the last referenced
>> sentence implies or does it continue to send the caches the kernel sees to
>> guarantee data integrity as good as possible as the previous documented part
>> might imply?
>>
>> I noticed this discrepancy when reporting a bug against dd (
>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=51345 ) that causes dd to return
>> early when it is used with its fsync capability while the kernel still sees
>> caches and consulting the fsync() manpage made it not clear if such a
>> theoretical possibility from the fsync() system call would be intended or not
>> so eventually this part could be slightly enhanced.
>>
>
> I don't know how fsync(2) works. Could some kernel fs programmer please
> check if the text matches the implementation, and if that issue reported
> should be reworded in the manual page?

I don't know what "sees caches" means in a few spots in the above
text. In simplified terms, fsync will write out dirty data and then
ensure that it is stable on media. The latter is your cache flush, if
the underlying device is using some sort of writeback caching. By the
time the flush is issued, there is no more dirty kernel cached data for
that file.
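
As a rough userspace illustration of that sequence (a sketch only; the
helper name write_durably is made up for this example and is not part
of any API), the application writes its data, then calls fsync() and
checks the result before treating the data as stable:

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to path, then fsync().
 * Returns 0 on success, -1 on failure with errno set.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until done. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /*
     * The kernel writes out the file's dirty data and, if the device
     * uses a writeback cache, issues a cache flush. fsync() returns
     * once the device has reported completion.
     */
    if (fsync(fd) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    return close(fd);
}
```

A return value of 0 here only means the device acknowledged the flush.
Note also that for a newly created file, fsync() on the file does not
necessarily make its directory entry stable; that needs an fsync() on
the containing directory as well.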

If the device doesn't honor a cache flush (i.e. "all writes previously
acked are now stable"), then there's nothing the kernel can do about it.
It would not even know. The only way to find out is if a power cut hits
after a flush and, once power is restored, the media contains stale
data.

There is no issue here. If your storage device is lying to you, buy
better storage devices.

--
Jens Axboe