Tejun Heo wrote:
Pierre Ossman wrote:
After testing this it seems the block layer never gives me more than
max_hw_segs segments. Is it being clever because I'm compiling for a
system without an IOMMU?
The hardware should (haven't properly tested this) be able to get new
DMA addresses during a transfer. In essence scatter gather with some CPU
support. Since I avoid MMC overhead this should give a nice performance
boost. But this relies on the block layer giving me more than one
segment. Do I need to lie in max_hw_segs to achieve this?
Hi, Pierre.
max_phys_segments: the maximum number of segments in a request
*before* DMA mapping
max_hw_segments: the maximum number of segments in a request
*after* DMA mapping (i.e. after IOMMU merging)
Those maximum numbers are for the block layer. The block layer must
not exceed the above limits when it passes a request downward. As long
as all entries in sg are processed, the block layer doesn't care
whether sg iteration is performed by the driver or the hardware.
So, if you're gonna perform sg by iterating in the driver, the
numbers to report for max_phys_segments and max_hw_segments are
entirely up to how many entries the driver can handle.
Just report some nice number (64 or 128?) for both. Don't forget that
the number of sg entries can decrease after DMA mapping on
machines with an IOMMU.
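To make the before/after distinction concrete, here is a rough userspace sketch that counts segments for the same buffer list under both views. It is only a simplified model: `struct seg`, `PAGE_SZ` and both function names are made up for illustration, and the merge rule (an IOMMU can join two neighbours when the end of one and the start of the next are both page aligned) is an assumption about how the merging behaves, not the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u   /* assumed IOMMU page size */

/* Hypothetical stand-in for one scatter-gather entry. */
struct seg {
    uint64_t addr;      /* physical start address */
    uint32_t len;       /* length in bytes */
};

/* Segment count before DMA mapping: neighbours merge only when
 * they are physically contiguous. */
static size_t count_phys_segments(const struct seg *s, size_t n)
{
    size_t count = 0;

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || s[i - 1].addr + s[i - 1].len != s[i].addr)
            count++;
    }
    return count;
}

/* Segment count after DMA mapping in this model: neighbours also
 * merge when the boundary between them is page aligned on both
 * sides, since an IOMMU could then map them back to back. */
static size_t count_hw_segments(const struct seg *s, size_t n)
{
    size_t count = 0;

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i == 0) {
            count++;
            continue;
        }
        uint64_t prev_end = s[i - 1].addr + s[i - 1].len;
        int contiguous = (prev_end == s[i].addr);
        int remappable = ((prev_end | s[i].addr) & (PAGE_SZ - 1)) == 0;

        if (!contiguous && !remappable)
            count++;
    }
    return count;
}
```

With four physically scattered buffers where only the last boundary is unaligned, the first function reports four segments and the second reports two, which is the kind of reduction meant above.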
IOW, the part which performs sg iteration gets to determine the above
limits. In your case, the driver is responsible for both iterations
(pre and post DMA mapping), so all the limits are up to the driver.
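As a sketch of what "the driver performs the sg iteration" could look like for hardware that takes one DMA address at a time, here is a small userspace mock. Everything in it is hypothetical (`struct fake_hw`, `struct sg_entry`, the register names and helper functions); a real driver would reload the address from its interrupt handler rather than in a plain loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DMA-mapped scatter-gather entry. */
struct sg_entry {
    uint64_t dma_addr;
    uint32_t len;
};

/* Mock controller with a single DMA address/length register pair. */
struct fake_hw {
    uint64_t dma_addr_reg;
    uint32_t dma_len_reg;
    unsigned int reloads;   /* how often the CPU reprogrammed it */
    uint64_t bytes_done;    /* total bytes "transferred" so far */
};

/* Program the next segment into the controller. */
static void hw_load_segment(struct fake_hw *hw, const struct sg_entry *e)
{
    hw->dma_addr_reg = e->dma_addr;
    hw->dma_len_reg = e->len;
    hw->reloads++;
}

/* Stand-in for "segment done" (an interrupt on real hardware). */
static void hw_finish_segment(struct fake_hw *hw)
{
    hw->bytes_done += hw->dma_len_reg;
    hw->dma_len_reg = 0;
}

/* Driver-side iteration: one request, many segments, and the
 * hardware only ever sees one of them at a time. */
static uint64_t driver_run_request(struct fake_hw *hw,
                                   const struct sg_entry *sg,
                                   size_t nents)
{
    assert(hw != NULL && (sg != NULL || nents == 0));
    for (size_t i = 0; i < nents; i++) {
        hw_load_segment(hw, &sg[i]);
        hw_finish_segment(hw);
    }
    return hw->bytes_done;
}
```

The point of the sketch is that the segment limit here comes from how many entries the driver is willing to walk, not from the single-address hardware.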
I'm still a bit confused why the block layer needs to know the maximum
number of hw segments. Different hardware might be connected to
different IOMMUs, so only the driver will know how much the number can
be reduced. So the block layer should only care about not going above
max_phys_segments, since that's what the driver has room for.
What is the scenario that requires both?