Re: [Lsf-pc] [LSF/MM TOPIC] multi-stream IO hint implementation proposal for LSF/MM 2016
From: Dave Chinner
Date: Wed Feb 17 2016 - 18:36:36 EST
On Wed, Feb 17, 2016 at 04:21:55PM +0100, Jan Kara wrote:
> On Sat 13-02-16 01:50:09, Changho Choi-SSI wrote:
> > Dear Program committee,
> >
> > I wanted to propose a technical discussion.
> > Please let me know if there is anything else that I have to submit and/or
> > prepare.
>
> As a side note: It is good to CC other relevant mailing lists so that
> corresponding developers can react to the proposal.
>
> > ==
> > Linux Kernel Multi-stream I/O Hint Implementation
> >
> > Enterprise, datacenter, and client systems increasingly deploy NAND
> > flash-based SSDs. However, in use, SSDs cannot avoid inevitable garbage
> > collection that deterministically causes write amplification which
> > decreases device performance. Unfortunately, write amplification also
> > decreases SSD lifetime. However, with multi-stream, unavoidable garbage
> > collection overhead (e.g., write amplification) can be significantly
> > reduced. For multi-stream devices, the host tags device I/O write
> > requests with a stream ID (e.g., I/O hint). The SSD controller places the
> > data in media erase blocks according to the stream ID. For example, an SSD
> > controller stores data with the same stream ID in an associated physical
> > location inside the SSD. In this way, multi-stream depends on host I/O
> > hints. So it is useful to develop how to implement multi-stream I/O hints
> > under limited protocol constraints. The T10 SCSI standard group has
> > already standardized the multi-stream feature and NVMe standardization is
> > anticipated in March, 2016. Many Linux users want to leverage
> > multi-stream as a mainstream Linux feature since they have seen
> > performance improvement and SSD lifetime extension when evaluating
> > multi-stream enabled devices. Hence, the multi-stream feature is a good
> > Linux community development candidate and should be discussed within the
> > community. I propose this multi-stream topic (i.e., I/O write hint
> > implementation) in a discussion session. I can briefly present the
> > multi-stream system architecture and answer any technical questions.
>
> So a key question for a feature like this is: How many stream IDs are
> devices going to support? Because AFAIR so far the answer was "it depends
> on the device". However, the design of how stream IDs can be used differs
> greatly between "a couple of stream IDs" and e.g. 2^32 stream IDs. Without
> this information I don't think the discussion would be very useful. So can
> you provide some rough numbers?
To me, the biggest problem these hint proposals have had in the past
is with the user-facing API. Passing hints through the kernel IO
stack isn't a huge issue - it's how to get them into the kernel,
what defaults should be used when they are not provided, whether the
kernel can reserve streams for its own use (i.e. journal and
metadata streams), how to assign valid stream IDs outside of the
IO call interface consistently across different filesystems, whether
stream IDs should be persistent for an inode, error behaviour when
an invalid stream ID is used, etc.
I'd expect any discussion to get stuck on these sorts of topics
again, not on the nuts and bolts of the tech or plumbing the depths
of the IO stack...
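
To make the open questions above concrete, here is a toy userspace model in
Python. It is only a sketch of one possible set of answers, not a proposed
interface: the class, the function names, the reserved IDs and the choice of
EINVAL are all assumptions made up for illustration, and none of them exist
in the kernel.

```python
import errno

# Everything below is hypothetical; it models policy choices, not a real API.
NO_STREAM = 0                                   # assumed default: "no hint given"
KERNEL_RESERVED = {1: "journal", 2: "metadata"}  # assumed kernel-owned streams


class StreamPolicy:
    def __init__(self, device_streams):
        # Number of stream IDs the device exposes (device dependent).
        self.device_streams = device_streams
        # inode number -> stream ID remembered for that inode.
        self.inode_streams = {}

    def valid_user_ids(self):
        # User-visible IDs start after the reserved ones.
        return range(len(KERNEL_RESERVED) + 1, self.device_streams + 1)

    def set_inode_stream(self, ino, stream_id):
        # Assumed error behaviour: reject invalid or reserved IDs outright.
        if stream_id not in self.valid_user_ids():
            return -errno.EINVAL
        self.inode_streams[ino] = stream_id
        return 0

    def resolve(self, ino, per_io_hint=None):
        # A per-IO hint wins; otherwise fall back to the inode's stream,
        # and finally to the "no hint" default.
        if per_io_hint is not None:
            if per_io_hint not in self.valid_user_ids():
                return -errno.EINVAL
            return per_io_hint
        return self.inode_streams.get(ino, NO_STREAM)
```

Each branch corresponds to one of the questions: the default when no hint is
given, whether some IDs are kept back for the kernel, whether an inode
remembers its stream, and what happens on an invalid ID. A device with only a
couple of streams leaves almost nothing for userspace under this model, which
is why the number of supported IDs matters so much.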
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx