Re: [PATCH 1/2] fs: Do not dispatch FITRIM through separate super_operation

From: Lukas Czerner
Date: Thu Jun 02 2011 - 04:15:15 EST


On Thu, 2 Jun 2011, Kyungmin Park wrote:

> On Wed, Dec 8, 2010 at 1:52 AM, Chris Mason <chris.mason@xxxxxxxxxx> wrote:
> > Excerpts from Christoph Hellwig's message of 2010-12-07 04:27:49 -0500:
> >> On Fri, Nov 19, 2010 at 10:21:35AM -0500, Mark Lord wrote:
> >> > >I really hate to rely on this third party hearsay (from all sides), and
> >> > >have implemented TRIM support in qemu now.  I'll soon install win7 and
> >> > >will check out the TRIM patterns myself.
> >> >
> >> > Excellent!
> >>
> >> I did a Windows 7 installation under qemu today, and the result is:
> >
> > Great, thanks for testing this.
> >
> >>
> >>  - it TRIMs the whole device early during the installation
> >>  - after that I see a constant stream of small trims during the
> >>    installation.  It's using lots of non-contiguous ranges in a single
> >>    TRIM command, with sizes down to 8 sectors (4k) for a single range.
> >>  - after installation there is some background-trimming going on
> >>    even when doing no user interaction with the VM at all.
>
> Hi Lukas,
>
> Right now FITRIM only runs when the user triggers it. So how about
> implementing automatic batched discard at the kernel level?
> The idea is the same as on Windows: create a single thread, iterate over
> the superblocks, and call trim on each.
>
> Here's the pseudocode.
>
> 1. create the trim thread.
> 2. iterate over the superblocks with iterate_supers() in fs/super.c
> 3. check whether the queue supports the discard feature:
> blk_queue_discard(q)
> 4. wait on events
> 5. call sb->trim (which needs to be re-introduced)
>
> The difficult parts are how to define the events and how to trigger
> the trim thread,
> e.g., notified from the block layer, called from the filesystem, and so on.
>
> What do you think?

Hi Kyungmin,

Generally I think this is a good idea, and I have thought about it as well.
However, I also think that we might want to wait for FITRIM and
discard-capable devices to settle down, to see how they perform and
whether frequently calling discard on big chunks of the device has
unwanted consequences (as Dave Chinner pointed out in a different
thread, this kind of automation usually does).

Regarding events, the filesystem might watch the amount of data written
to it (and it usually does so right now), trigger the event when that
amount exceeds, let's say, 50% of the fs size, and then zero the counter.
The downside is that this is not controlled behaviour, hence we'll end up
with unpredictable behaviour of the filesystem in the long term.
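
To make the counter idea concrete, here is a rough sketch in plain
userspace C, not kernel code. The struct, its fields, and the function
name are all hypothetical and exist only for illustration; the 50%
figure is just the example value mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-filesystem accounting; not an existing kernel structure. */
struct trim_counter {
	uint64_t fs_size_bytes;         /* total size of the filesystem */
	uint64_t written_bytes;         /* bytes written since the last event */
	unsigned int threshold_percent; /* e.g. 50 */
};

/*
 * Account for a write of `bytes`. Returns true when the amount written
 * since the last event has reached the threshold, and zeroes the counter
 * in that case. Dividing before multiplying avoids 64-bit overflow at the
 * cost of rounding the threshold down (it becomes 0 for filesystems
 * smaller than 100 bytes, so every write would fire).
 */
bool trim_counter_account(struct trim_counter *tc, uint64_t bytes)
{
	uint64_t threshold = tc->fs_size_bytes / 100 * tc->threshold_percent;

	tc->written_bytes += bytes;
	if (tc->written_bytes < threshold)
		return false;

	tc->written_bytes = 0;
	return true;
}
```

Note that nothing here looks at how busy the device is, which is exactly
why the resulting behaviour would be hard to predict.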

The solution might be (and it is something I want to look into)
infrastructure to determine the size of the device's queue from within
the filesystem (or vfs), so we can tell when the device is busy and we
want to wait, or when it is idle and we can discard.
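
As a rough illustration of that decision only (no such infrastructure
exists yet), here is a userspace C sketch. The struct, the enum, and the
busy_threshold knob are hypothetical; how the filesystem would actually
learn the queue depth is the open question.

```c
#include <stdbool.h>

/* Hypothetical snapshot of a device queue as seen by the filesystem. */
struct dev_queue_state {
	unsigned int in_flight;      /* requests currently queued to the device */
	unsigned int busy_threshold; /* above this the device counts as busy */
};

enum trim_decision {
	TRIM_NOTHING_TO_DO, /* no trim event is pending */
	TRIM_WAIT,          /* device is busy, retry later */
	TRIM_NOW            /* device is idle enough, discard now */
};

/* Decide whether a pending trim should run now or be deferred. */
enum trim_decision trim_decide(const struct dev_queue_state *q,
			       bool trim_pending)
{
	if (!trim_pending)
		return TRIM_NOTHING_TO_DO;
	if (q->in_flight > q->busy_threshold)
		return TRIM_WAIT;
	return TRIM_NOW;
}
```

A trim thread could call something like this each time it wakes up and go
back to sleep on TRIM_WAIT.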

Also, per-filesystem or per-device control will be needed, so we can
selectively turn it on and off. But as I said, I am not sure that doing
this right now is the best idea, though others might have a different
opinion.

Thanks!
-Lukas

>
> Thank you,
> Kyungmin Park
>
> >>  - removing files leads to an instant stream of TRIMs, again vectored
> >>    and of all sizes down to 4k.  Note that the TRIMs are a lot more
> >>    instant than even with btrfs and -o discard, which delays most
> >>    TRIMs until doing a sync.
> >
> > Btrfs will do some small trims right when the block is freed, especially
> > in fsync-heavy workloads, but this is suboptimal and something I want to fix.
> >
> > The code tries to gather a whole transaction worth of trims and do them
> > after the commit is done.
> >
> > -chris
> >
>
