Hello!Thank you for your reply!
The problem is with a one-master-twoslave MYSQL database with three
On Sat 02-12-23 17:14:30, Baokun Li wrote:
Recently, while running some pressure tests on MYSQL, noticed thatYes, combining buffered reads with direct IO writes is a recipe for
occasionally a "corrupted data in log event" error would be reported.
After analyzing the error, I found that extending DIO write and buffered
read were competing, resulting in some zero-filled page end being read.
Since ext4 buffered read doesn't hold an inode lock, and there is no
field in the page to indicate the valid data size, it seems to me that
it is impossible to solve this problem perfectly without changing these
two things.
problems and pretty much in the "don't do it" territory. So honestly I'd
consider this a MYSQL bug. Were you able to identify why does MYSQL use
buffered read in this case? It is just something specific to the test
you're doing?
I also feel that a scenario like buffered reads + DIO writes is strange. ButIn this series, the first patch reads the inode size twice, and takes theNo, there are no plans to address this AFAIK. Because such locking will
smaller of the two values as the copyout limit to avoid copying data that
was not actually read (0-padding) into the user buffer and causing data
corruption. This greatly reduces the probability of problems under 4k
page. However, the problem is still easily triggered under 64k page.
The second patch waits for the existing dio write to complete and
invalidate the stale page cache before performing a new buffered read
in ext4, avoiding data corruption by copying the stale page cache to
the user buffer. This makes it much less likely that the problem will
be triggered in a 64k page.
Do we have a plan to add a lock to the ext4 buffered read or a field in
the page that indicates the size of the valid data in the page? Or does
anyone have a better idea?
slow down all the well behaved applications to fix a corner case for
application doing unsupported things. Sure we must not crash the kernel,
corrupt the filesystem or leak sensitive (e.g. uninitialized) data if app
combines buffered and direct IO but returning zeros instead of valid data
is in my opinion fully within the range of acceptable behavior for such
case.
Honza