On Wed, Aug 19, 2020 at 09:14:13AM +0000, Damien Le Moal wrote:
While defining a zone append command for SCSI/ZBC is possible (using sense data
for returning the written offset), there is no way to define zone append for
SATA/ZAC without entirely breaking the ATA command model. This is why we went
after an emulation implementation instead of trying to standardized native
commands. That implementation does not have any performance impact over regular
writes *and* zone write locking does not in general degrade HDD write
performance (only a few corner cases suffer from it). Comparing things equally,
the same could be said of NVMe drives that do not have zone append native
support: performance will be essentially the same using regular writes and
emulated zone append. But mq-deadline and zone write locking will significantly
lower performance for emulated zone append compared to a native zone append
support by the drive.
And to summarize the most important point - Zone Append doesn't exist
in ZAC/ABC. For people that spent the last years trying to make zoned
storage work, the lack of such a primite has been the major pain point.
That's why I came up with the Zone Append design in response to a
request for such an operation from another company that is now heavily
involved in both Linux development and hosting Linux VMs. For ZAC and
ZBC the best we can do is to emulate the approach in the driver, but
for NVMe we can do it. ZNS until just before the release had Zone
Append mandatory, and it did so for a very good reason. While making
it optional allows OEMs to request drives without it, I fundamentally
think we should not support that in Linux and request vendors do
implement writes to zones the right way.
And just as some OEMs can request certain TPs or optional features to
be implemented, so can Linux. Just to give an example from the zone
world - Linux requires uniform and power of two zone sizes, which in
ZAC and ZBC are not required.