Re: [PATCH v5 00/22] Rewrite XIP code and add XIP support to ext4
From: Dave Chinner
Date: Thu Jan 30 2014 - 01:42:45 EST
On Wed, Jan 15, 2014 at 08:24:18PM -0500, Matthew Wilcox wrote:
> This series of patches add support for XIP to ext4. Unfortunately,
> it turns out to be necessary to rewrite the existing XIP support code
> first due to races that are unfixable in the current design.
>
> Since v4 of this patchset, I've improved the documentation, fixed a
> couple of warnings that a newer version of gcc emitted, and fixed a
> bug where we would read/write the wrong address for I/Os that were not
> aligned to PAGE_SIZE.
Looks like there's something fundamentally broken with the patch set
as it stands. I get this same data corruption on both ext4 and XFS
with XIP using fsx. It's as basic as it gets - the first read after
a mmapped write fails to see the data written by mmap:
$ sudo mkfs.xfs -f /dev/ram0
meta-data=/dev/ram0 isize=256 agcount=4, agsize=256000 blks
= sectsz=512 attr=2, projid32bit=1
= crc=0
data = bsize=4096 blocks=1024000, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=0
log =internal log bsize=4096 blocks=12800, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
$ sudo mount -o xip /dev/ram0 /mnt/scr
$ sudo chmod 777 /mnt/scr
$ ltp/fsx -d -N 1000 -S 0 /mnt/scr/fsx
Seed set to 3774
1 mapwrite 0x3db39 thru 0x3ffff (0x24c7 bytes)
2 mapread 0x2e947 thru 0x33163 (0x481d bytes)
3 read 0x2e836 thru 0x3cba1 (0xe36c bytes)
4 punch from 0x2e7 to 0x5c43, (0x595c bytes)
5 mapwrite 0xcaea thru 0x13ba9 (0x70c0 bytes)
6 punch from 0x31645 to 0x38d1d, (0x76d8 bytes)
7 falloc from 0x24f92 to 0x2f2b7 (0xa325 bytes)
fallocating to largest ever: 0x171ac
8 falloc from 0xbcf1 to 0x171ac (0xb4bb bytes)
9 read 0x126f thru 0x11136 (0xfec8 bytes)
READ BAD DATA: offset = 0x126f, size = 0xfec8, fname = /mnt/scr/fsx
OFFSET GOOD BAD RANGE
0x caea 0x05f9 0x0000 0x 0
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caeb 0xf905 0x0000 0x 1
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caec 0x0599 0x0000 0x 2
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caed 0x9905 0x0000 0x 3
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caee 0x05e8 0x0000 0x 4
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caef 0xe805 0x0000 0x 5
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf0 0x0580 0x0000 0x 6
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf1 0x8005 0x0000 0x 7
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf2 0x056c 0x0000 0x 8
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf3 0x6c05 0x0000 0x 9
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf4 0x05ad 0x0000 0x a
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf5 0xad05 0x0000 0x b
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf6 0x0539 0x0000 0x c
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf7 0x3905 0x0000 0x d
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf8 0x05db 0x0000 0x e
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf9 0xdb05 0x0000 0x f
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
LOG DUMP (9 total operations):
1( 1 mod 256): MAPWRITE 0x3db39 thru 0x3ffff (0x24c7 bytes)
2( 2 mod 256): MAPREAD 0x2e947 thru 0x33163 (0x481d bytes)
3( 3 mod 256): READ 0x2e836 thru 0x3cba1 (0xe36c bytes)
4( 4 mod 256): PUNCH 0x2e7 thru 0x5c42 (0x595c bytes)
5( 5 mod 256): MAPWRITE 0xcaea thru 0x13ba9 (0x70c0 bytes) ******WWWW
6( 6 mod 256): PUNCH 0x31645 thru 0x38d1c (0x76d8 bytes)
7( 7 mod 256): FALLOC 0x24f92 thru 0x2f2b7 (0xa325 bytes) INTERIOR
8( 8 mod 256): FALLOC 0xbcf1 thru 0x171ac (0xb4bb bytes) INTERIOR ******FFFF
9( 9 mod 256): READ 0x126f thru 0x11136 (0xfec8 bytes) ***RRRR***
Correct content saved for comparison
(maybe hexdump "/mnt/scr/fsx" vs "/mnt/scr/fsx.fsxgood")
XFS gives a good indication that we aren't doing something correctly
w.r.t. mapped XIP writes, as trying to fiemap the file ASSERT fails
with a delayed allocation extent somewhere inside the file after a
sync. I shall keep digging.
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/