Re: [PATCH v2 0/5] mm/shmem: optimize read with reduced xarray lookups and folio batching
From: Chi Zhiling
Date: Mon Jun 01 2026 - 03:17:39 EST
On 6/1/26 13:56, Chi Zhiling wrote:
Performance Results:

Apologies, due to my oversight the tests involving holes were incorrect: the holes were not successfully created in the files during testing.
fio --ioengine=sync --rw=read --bs=$1 --size=1G --runtime=180 --time_based --group_reporting --name=seq_read_test --filename=testfile
| THP disabled in tmpfs | v7.1-rc5 | v7.1-rc5 + fbatch | Improvement |
| ---------------------- | ------------ | ----------------- | ----------- |
| 1M + normal file | bw=11.5GiB/s | bw=12.7GiB/s | +10.4% |
| 64k + normal file | bw=11.0GiB/s | bw=12.3GiB/s | +11.8% |
| 4k + normal file | bw=3826MiB/s | bw=3849MiB/s | +0.6% |
| 1M + fallocated file | bw=23.8GiB/s | bw=28.6GiB/s | +20.2% |
| 64k + fallocated file | bw=22.5GiB/s | bw=27.3GiB/s | +21.3% |
| 4k + fallocated file | bw=4655MiB/s | bw=4680MiB/s | +0.5% |
| 1M + hole | bw=24.2GiB/s | bw=28.6GiB/s | +18.2% |
| 64k + hole | bw=22.6GiB/s | bw=27.6GiB/s | +22.1% |
| 4k + hole | bw=4652MiB/s | bw=4489MiB/s | -3.5% |
| THP enabled in tmpfs | v7.1-rc5 | v7.1-rc5 + fbatch | Improvement |
| --------------------- | ------------ | ----------------- | ----------- |
| 1M + normal file | bw=13.7GiB/s | bw=13.9GiB/s | +1.4% |
| 64k + normal file | bw=13.5GiB/s | bw=13.5GiB/s | +0.0% |
| 4k + normal file | bw=3833MiB/s | bw=3859MiB/s | +0.7% |
| 1M + fallocated file | bw=24.9GiB/s | bw=34.2GiB/s | +37.3% |
| 64k + fallocated file | bw=23.0GiB/s | bw=31.4GiB/s | +36.5% |
| 4k + fallocated file | bw=4710MiB/s | bw=4655MiB/s | -1.2% |
| 1M + hole | bw=24.3GiB/s | bw=34.5GiB/s | +42.0% |
| 64k + hole | bw=23.5GiB/s | bw=31.1GiB/s | +32.3% |
| 4k + hole | bw=4690MiB/s | bw=4647MiB/s | -0.9% |
Below are the corrected results from a retest:
| THP disabled | v7.1-rc5 | v7.1-rc5 + fbatch | Improvement |
| ------------ | ------------ | ----------------- | ----------- |
| 1M + hole | bw=27.3GiB/s | bw=23.4GiB/s | -14.3% |
| 64k + hole | bw=27.3GiB/s | bw=23.3GiB/s | -14.7% |
| 4k + hole | bw=4825MiB/s | bw=4624MiB/s | -4.2% |
| THP enabled | v7.1-rc5 | v7.1-rc5 + fbatch | Improvement |
| ----------- | ------------ | ----------------- | ----------- |
| 1M + hole | bw=27.0GiB/s | bw=23.1GiB/s | -14.4% |
| 64k + hole | bw=27.5GiB/s | bw=23.3GiB/s | -15.3% |
| 4k + hole | bw=4777MiB/s | bw=4640MiB/s | -2.9% |
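For reference, the Improvement column is consistent with a plain relative change against the baseline bandwidth. A minimal sketch of that arithmetic, assuming both values are in the same unit; the helper name `improvement_pct` is only illustrative and is not part of the test setup:

```python
def improvement_pct(baseline, patched):
    """Relative bandwidth change of `patched` over `baseline`, in percent."""
    return (patched - baseline) / baseline * 100.0


# Values taken from the retest tables above (GiB/s or MiB/s, same unit per pair).
print(f"{improvement_pct(27.3, 23.4):+.1f}%")  # 1M + hole, THP disabled: -14.3%
print(f"{improvement_pct(4825, 4624):+.1f}%")  # 4k + hole, THP disabled: -4.2%
```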
There is a noticeable performance drop when accessing holes, as every read triggers a fallback. I will address this in the next version.
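Since the original mistake was that the holes were never actually created, a quick check before each run may help. The following is only a sketch of one possible way to do it, not the procedure used for the numbers above: it assumes the hole file is created by extending an empty file with ftruncate(), and it uses lseek(SEEK_DATA) to confirm that no data is present. The helper names are hypothetical. tmpfs implements SEEK_DATA/SEEK_HOLE; on a filesystem that does not, the whole file is reported as data.

```python
import errno
import os


def make_hole_file(path, size):
    """Create (or reset) `path` as a `size`-byte file that is one big hole."""
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_RDWR, 0o644)
    try:
        # Extending with ftruncate changes i_size without writing any pages.
        os.ftruncate(fd, size)
    finally:
        os.close(fd)


def has_data(path):
    """Return True if lseek(SEEK_DATA) finds any data at or after offset 0."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, 0, os.SEEK_DATA)
        return True
    except OSError as e:
        if e.errno == errno.ENXIO:  # no data between offset 0 and EOF
            return False
        raise
    finally:
        os.close(fd)
```

On tmpfs, `has_data()` is expected to return False right after `make_hole_file()`; if it returns True before the fio run, the file is not a pure hole.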