the attached patch (against 2.5.1-final) includes the next round of RAID-1
improvements. First, it completes the raid1.c cleanups I planned, and it
also adds a number of new RAID-1 performance features.
Changelog:
- cleaned up the resync engine. It got much simpler and easier to
maintain, while still saturating the disks. Resync doesn't get stuck
under heavy load anymore. (this code can be switched to use explicit IO
barrier requests in the future.)
- rewrote the read balancing code to use two estimators: a per-array
'next expected sequential IO' position, plus an IRQ-driven 'estimated
disk head' position. The head position is now updated from all the IO
completion routines: end of READ, end of WRITE, end of resync-READ, end
of resync-WRITE. I've added per-disk tracking of pending requests,
and the read balancer now detects idle disks and utilizes them before
trying to read-balance between busy disks. I've also removed the
sector_count limit that artificially switched the current disk. These
changes make read balancing more accurate and more effective.
- the old raid1 code used to have a limitation: it always read from
the first disk until the resync finished. Now the code will
read-balance READ requests up to the resync boundary. This should
further improve performance during resyncs.
- added the 'idle IO resync' feature which we used to have in the 2.2
patches, but via a different implementation that does not touch the
generic block IO code. Resync happens only when there is no normal IO
pending on the array. This feature should make resync a more seamless
operation. Resync behavior can be tuned via the speed_limit_min and
speed_limit_max sysctl tunables. Default for the minimum resync speed
is 500 KB/sec, the maximum is 200 MB/sec.
- fixed a number of sector_t <=> unsigned long bugs still left.
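
As a rough illustration of the read-balancing and idle-resync items above
(a userspace sketch, not the code from the patch): the struct layouts, field
names and the functions choose_read_disk(), io_done() and
resync_may_proceed() are hypothetical, and the exact tie-breaking and
throttling rules are assumptions. The sketch assumes that disk 0 is the
resync source, that sectors below resync_boundary are in sync on all
mirrors, and that speeds are given in KB/sec.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;      /* 64-bit, so no unsigned long truncation */

/* Hypothetical per-disk state. */
struct mirror {
    sector_t head_position;     /* estimated head position */
    int      nr_pending;        /* requests in flight on this disk */
};

/* Hypothetical per-array state. */
struct array_state {
    struct mirror *mirrors;
    int      nr_disks;
    sector_t next_seq_sector;   /* 'next expected sequential IO' position */
    int      last_used;         /* disk that served the previous read */
    sector_t resync_boundary;   /* UINT64_MAX when no resync is running */
};

static sector_t sector_distance(sector_t a, sector_t b)
{
    return a > b ? a - b : b - a;
}

/* Pick a disk for a READ of nr_sectors starting at sector. */
int choose_read_disk(struct array_state *conf, sector_t sector,
                     sector_t nr_sectors)
{
    int i, best, best_idle = -1, best_any = 0;

    assert(conf->nr_disks > 0);

    if (sector + nr_sectors > conf->resync_boundary) {
        /* Beyond the resync boundary only the source disk is trusted. */
        best = 0;
    } else if (sector == conf->next_seq_sector) {
        /* Sequential stream: stay on the disk used last time. */
        best = conf->last_used;
    } else {
        for (i = 0; i < conf->nr_disks; i++) {
            struct mirror *m = &conf->mirrors[i];
            sector_t d = sector_distance(sector, m->head_position);

            /* Track the closest idle disk, if there is one. */
            if (m->nr_pending == 0 &&
                (best_idle < 0 || d < sector_distance(sector,
                        conf->mirrors[best_idle].head_position)))
                best_idle = i;
            /* Track the closest disk overall as a fallback. */
            if (d < sector_distance(sector,
                        conf->mirrors[best_any].head_position))
                best_any = i;
        }
        /* Idle disks are used before balancing between busy ones. */
        best = best_idle >= 0 ? best_idle : best_any;
    }

    conf->next_seq_sector = sector + nr_sectors;
    conf->last_used = best;
    conf->mirrors[best].nr_pending++;
    return best;
}

/* Called from every IO completion path (READ, WRITE, resync IO). */
void io_done(struct array_state *conf, int disk, sector_t end_sector)
{
    conf->mirrors[disk].head_position = end_sector;
    conf->mirrors[disk].nr_pending--;
}

/* Decide whether the resync thread may issue more IO right now. */
int resync_may_proceed(long cur_speed, long speed_limit_min,
                       long speed_limit_max, int array_idle)
{
    if (cur_speed <= speed_limit_min)
        return 1;               /* guaranteed minimum resync rate */
    if (cur_speed > speed_limit_max)
        return 0;               /* never exceed the maximum */
    return array_idle;          /* in between: only when no normal IO */
}
```

With two idle disks whose heads sit at sectors 1000 and 5000, a read at
sector 4990 goes to the second disk; a following non-sequential read is
then steered to the idle first disk even though its head is farther away.
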
despite the new features, the patch makes raid1.c 8% smaller, so
it's a win-win situation :-) I've tested the patch on UP and SMP as well,
and it's working just fine for me both in degraded-mode and normal mode,
but the usual warnings (do not use on production system, etc.) apply.
Comments, reports, suggestions welcome,
Ingo