Re: [PATCH 0/2] ext4: fix a data corruption issue in nojournal mode

From: Gao Xiang

Date: Mon Oct 06 2025 - 10:37:23 EST


Hi Jan,

On 2025/10/6 21:52, Jan Kara wrote:
> Hi Ted!
>
> I think this patch series has fallen through the cracks. Can you please
> push it to Linus? Given there are real users hitting the data corruption,
> we should do it soon (although it isn't a new issue so it isn't
> supercritical).

Thanks for the ping.



..


>> Some of our internal businesses actually rely on EXT4
>> no_journal mode, and when they upgraded the kernel from
>> 4.19 to 5.10, they read corrupted data after the page
>> cache was reclaimed (the on-disk data was actually
>> corrupted even earlier).
>>
>> So personally I wonder what the current status of
>> EXT4 no_journal mode is, since this issue has existed
>> for more than 5 years, but some people may need
>> an extent-enabled ext2, so they selected this mode.

> The nojournal mode is fully supported. There are many enterprise customers
> (mostly cloud vendors) that depend on it. Including Ted's employer ;)

.. yet honestly, this issue can be easily observed with
no_journal + memory pressure, and our new 5.10 kernel
setup (previously 4.19) hits this issue very easily.

Unless memory is sufficient, the valid page cache can
mask this issue, but the on-disk data may still be
corrupted.

So we wonder how widely no_journal mode is used now,
and whether those deployments have memory-pressure workloads.


>> We already released an announcement advising customers
>> not to use no_journal mode because it seems to lack
>> sufficient maintenance (yet many end users are interested
>> in this mode):
>> https://www.alibabacloud.com/help/en/alinux/support/data-corruption-risk-and-solution-in-ext4-nojounral-mode

> Well, it's good to be cautious but the reality is that data corruption
> issues do happen from time to time. Both in nojournal mode and in normal
> journalled mode. And this one exists since the beginning when nojournal
> mode was implemented. So it apparently requires rather specific conditions
> to hit.

The original issue (the one fixed by Yi in 2019) existed
for quite a long time and I think it was hard to reproduce
(compared with this one). But the missing
clean_bdev_aliases() and clean_bdev_bh_alias() calls
introduced another serious regression (present since 2019
until now) which can be easily reproduced on some specific
VM setups. Our workload also creates and deletes some small
and large files, and data corruption can be observed once
some data is written in the extent layout, much like the
previous AWS report.
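For readers less familiar with the aliasing problem behind
clean_bdev_aliases(): the raw block device and a file can each hold a
cached buffer for the same disk block, and a stale block-device buffer
that is never invalidated can be written back after the file's data,
clobbering it on disk. A minimal userspace sketch of that ordering
hazard (purely illustrative; Cache, reuse_block() and the block numbers
are invented here, not kernel APIs):

```python
# Userspace simulation (NOT kernel code) of a block-device alias:
# two independent write-back caches over the same disk block.

class Cache:
    """A toy write-back cache mapping block number -> data."""
    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}

    def write(self, blk, data):
        self.dirty[blk] = data

    def flush(self):
        # Write back all dirty buffers to the shared "disk".
        for blk, data in self.dirty.items():
            self.disk[blk] = data
        self.dirty.clear()

    def invalidate(self, blk):
        # Analogue of clean_bdev_aliases(): drop any stale buffer
        # for a block that was just (re)allocated to a file.
        self.dirty.pop(blk, None)


def reuse_block(invalidate_alias):
    disk = {7: b"old"}
    bdev_cache = Cache(disk)   # raw block-device page cache
    file_cache = Cache(disk)   # per-file page cache

    # A stale buffer for block 7 lingers in the bdev cache
    # (e.g. left over from a deleted file's metadata).
    bdev_cache.write(7, b"stale")

    # Block 7 is now reallocated to a new file; without
    # invalidation the stale alias survives.
    if invalidate_alias:
        bdev_cache.invalidate(7)
    file_cache.write(7, b"new-file-data")

    # Writeback order is not guaranteed: here the file data is
    # flushed first and the stale alias last.
    file_cache.flush()
    bdev_cache.flush()
    return disk[7]

print(reuse_block(invalidate_alias=False))  # b'stale': on-disk corruption
print(reuse_block(invalidate_alias=True))   # b'new-file-data': correct
```

The page cache can hide the corruption as long as the file's pages stay
resident; only after they are reclaimed and re-read from disk does the
stale data become visible, matching the memory-pressure symptom above.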

Thanks,
Gao Xiang

