Re: [Patch 0/4] RFC : Support for data gradation of a single file.

From: Andreas Dilger
Date: Fri Apr 06 2018 - 17:31:52 EST


On Apr 6, 2018, at 5:41 AM, Sayan Ghosh <sgdgp.2014@xxxxxxxxx> wrote:
>
> Hi all,
>
> The following series of patches aim to store a file with a graded
> information. Consider a scenario of video indexing for learning
> programme where some of the portions of the video is annotated and
> important than other portions, hence to be accessed more often. We
> consider the similar scenario where we have a file along with a grade
> information that mentions which blocks are important and which are
> not. The grades we consider are binary with 1 denoting high grade.
> Now the file is stored in a LVM which comprises of different set of
> storage devices belong to different tiers (as ext4 doesn't support
> spanning over multiple block driver), - one combination could be
> persistent memory and hard-disk. The target is to store the higher
> graded blocks in the higher performance tier and the lower graded
> blocks in the lower performance tier.
> Consider a C code where the grade of the file blocks are being set in
> the user space through extended attribute. The grade structure stores
> the span of different high graded segments in the file with starting
> high grade block numbers and the span length of the segments. We
> assume grade of rest of the blocks as 0 (low).

There was a considerable amount of work and discussion on implementing
Stream IDs for the block layer. This would annotate writes from userspace
and allow the underlying storage (filesystem and block layer) to use the
stream ID for block allocation. See the following for more details:

https://lwn.net/Articles/717755/
https://lwn.net/Articles/726477/
http://lists.infradead.org/pipermail/linux-nvme/2017-June/011322.html

In the absence of other information, the Stream ID would just mean "group
allocations with the same ID together". After some discussion, it looks
like the latest patch has generic "lifetime" hints rather than "stream IDs",
but the end result is largely the same.
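
For reference, here is a minimal sketch of how userspace can attach one of
those lifetime hints to a file, assuming a Linux 4.13+ kernel where the
F_SET_RW_HINT/F_GET_RW_HINT fcntl() commands are available. The helper
names (grade_to_hint, set_write_hint, get_write_hint) and the mapping from
a binary grade to a hint value are hypothetical and only for illustration;
the only property relied on is that the two grades get distinct hints, so
that their allocations can be grouped separately.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>

/* Fallbacks for older libc headers; values taken from linux/fcntl.h. */
#ifndef F_SET_RW_HINT
#define F_GET_RW_HINT 1035 /* F_LINUX_SPECIFIC_BASE + 11 */
#define F_SET_RW_HINT 1036 /* F_LINUX_SPECIFIC_BASE + 12 */
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_NONE    1
#define RWH_WRITE_LIFE_SHORT   2
#define RWH_WRITE_LIFE_MEDIUM  3
#define RWH_WRITE_LIFE_LONG    4
#define RWH_WRITE_LIFE_EXTREME 5
#endif

/*
 * Hypothetical mapping (an assumption, not part of any patch series):
 * grade 1 (high) and grade 0 (low) are given two different hint values.
 */
uint64_t grade_to_hint(int grade)
{
    return grade ? RWH_WRITE_LIFE_LONG : RWH_WRITE_LIFE_SHORT;
}

/* Set the write lifetime hint on the inode behind fd; returns -1 on error. */
int set_write_hint(int fd, uint64_t hint)
{
    return fcntl(fd, F_SET_RW_HINT, &hint);
}

/* Read the hint back into *hint; returns -1 on error (errno is set). */
int get_write_hint(int fd, uint64_t *hint)
{
    return fcntl(fd, F_GET_RW_HINT, hint);
}
```

A caller would open the file, call set_write_hint(fd, grade_to_hint(1))
before writing a high-grade segment, and check the return value, since
older kernels reject the command with EINVAL. Note that this hint applies
to the whole inode rather than to a block range, so a per-segment scheme
would have to change the hint between writes.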

It would make sense for you to spend time testing and fixing that patch
series instead of trying to introduce a new interface. IMHO, there is
no need to make these hints persistent on disk, since their state could
be inferred directly from the allocation placement.

Cheers, Andreas




