Re: [PATCH] Configure IDE probe delays

From: Mark Lord
Date: Wed Sep 01 2004 - 10:44:16 EST


> Doing this is either pointless or impossible on newer SATA controllers.
> Most are memory-mapped I/O not PIO, where the high-order bits of the
> ATA taskfile are accessed due to an extended register size, not
> "double-pumping" a FIFO.
>
> Even-newer SATA controllers are FIS-based rather than taskfile-based, so
> you pass it a FIS (containing all the registers) unconditionally.

PCI accesses are not free, so anything that avoids having to
go over the PCI bus is a worthwhile optimization.

The processor buses run at 200-800MHz or so, whereas PCI is normally
clocked at only 33MHz, sometimes at 66MHz.

With good ADMA or host-queuing controllers that access system
memory directly for their command blocks, there's not much
(if any) penalty for the extra LBA48 setup. But for "normal"
controllers (if such a beast even exists), the extra writes across
the PCI bus can be costly.
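
A rough sketch of the "only use LBA48 when necessary" check follows. It
assumes the usual LBA28 limits (a 28-bit sector address and at most 256
sectors per command); needs_lba48() and the macro names are hypothetical
and are not taken from any existing driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* LBA28 limits: 28-bit sector address, 8-bit sector count
 * (a count register value of 0 means 256 sectors). */
#define LBA28_SECTOR_LIMIT (1ULL << 28)
#define LBA28_MAX_COUNT    256u

/* Hypothetical helper: true when a request cannot be expressed as LBA28
 * and the extra high-order taskfile writes are unavoidable. */
static bool needs_lba48(uint64_t lba, uint32_t nsect)
{
    if (nsect > LBA28_MAX_COUNT)
        return true;                    /* count does not fit in 8 bits */

    /* The last sector touched must still be addressable in 28 bits. */
    return lba + nsect > LBA28_SECTOR_LIMIT;
}
```

A driver following this idea would issue the LBA48 command and its extra
register writes only when needs_lba48() returns true, and fall back to
the shorter LBA28 setup otherwise.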

Hardware write-buffer FIFOs between the CPU and the PCI bus
can reduce the impact of this somewhat, but they are often
only 2-4 entries deep, and will be filled by a normal (S)ATA
command setup sequence.

This is one of those finer points that is very difficult to measure,
since the I/O throughput is pretty much unaffected by it. But CPU
cycle count per-I/O setup is one way to measure it.
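
As a back-of-the-envelope version of that measurement, the sketch below
turns the figures quoted further down in this thread (roughly 600ns per
IDE register access, four extra accesses for LBA48) into a CPU cycle
estimate. The CPU clock passed in is an assumption, and the function
names are hypothetical.

```c
#include <stdint.h>

/* Figures quoted in this thread; real hardware will vary. */
#define REG_ACCESS_NS      600u   /* approx. cost of one IDE register access */
#define LBA48_EXTRA_WRITES 4u     /* extra accesses needed for LBA48 setup   */

/* Extra setup time per I/O when LBA48 is used unnecessarily. */
static uint64_t lba48_extra_ns(void)
{
    return (uint64_t)REG_ACCESS_NS * LBA48_EXTRA_WRITES;
}

/* Convert nanoseconds to CPU cycles for an assumed clock given in MHz. */
static uint64_t ns_to_cycles(uint64_t ns, uint32_t cpu_mhz)
{
    return ns * cpu_mhz / 1000u;
}
```

With an assumed 3000MHz core, ns_to_cycles(lba48_extra_ns(), 3000) comes
to 7200 cycles per I/O spent stalled on the extra register writes.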

Cheers
--
Mark Lord
(hdparm keeper & the original "Linux IDE Guy")

Jeff Garzik wrote:
> Mark Lord wrote:
>> Bartlomiej Zolnierkiewicz wrote:
>>>> What determines whether 48 bit addressing will be used then?
>>>
>>> Availability of 48-bit addressing feature set and host capabilities
>>> (some don't support LBA48 when DMA is used etc.).
>>
>> I haven't examined the "released" IDE drivers in some time,
>> but one optimisation that can save a LOT of CPU usage
>> is for the driver to only use LBA48 *when necessary*,
>> and use LBA28 I/O otherwise.
>>
>> Each access to an IDE register typically chews up 600+ns,
>> or the equivalent of a couple thousand instruction executions
>> on a modern core. Avoiding LBA48 when it's not needed will
>> save four such accesses per I/O, or about 2.5us.
>
> Doing this is either pointless or impossible on newer SATA controllers.
> Most are memory-mapped I/O not PIO, where the high-order bits of the
> ATA taskfile are accessed due to an extended register size, not
> "double-pumping" a FIFO.
>
> Even-newer SATA controllers are FIS-based rather than taskfile-based, so
> you pass it a FIS (containing all the registers) unconditionally.
>
> Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
