Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks

From: Ric Wheeler
Date: Thu Mar 17 2016 - 13:50:57 EST


On 03/17/2016 01:47 PM, Linus Torvalds wrote:
> On Wed, Mar 16, 2016 at 10:18 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> So we've not asked for NO_HIDE_STALE on the mailing lists, but I think
>> it was one of the problems Sage had using xfs in his BlueStore
>> implementation and was a big part of why it moved to pure userspace.
>> FileStore might use NO_HIDE_STALE in some places but it would be
>> pretty limited. When it came up at Linux FAST we were discussing how
>> it and similar things had been problems for us in the past and it
>> would've been nice if they were upstream.
>
> Hmm.
>
> So to me it really sounds like somebody should cook up a patch, but we
> shouldn't put it in the upstream kernel until we get numbers and
> actual "yes, we'd use this" from outside of google.
>
> I say "outside of google", because inside of google not only do we not
> get numbers, but google can maintain their own patch.
>
> But maybe Ted could at least post the patch google uses, and somebody
> in the Ceph community might want to at least try it out...
>
>> What *is* a big deal for
>> FileStore (and would be easy to take advantage of) is the thematically
>> similar O_NOMTIME flag, which is also about reducing metadata updates
>> and got blocked on similar stupid-user grounds (although not security
>> ones): http://thread.gmane.org/gmane.linux.kernel.api/10727.
>
> Hmm. I don't hate that patch, because the NOATIME thing really does
> wonders on many loads. NOMTIME makes sense.
>
> It's not like you can't do this with utimes() anyway.
>
> That said, I do wonder if people wouldn't just prefer to expand on and
> improve on the lazytime.
>
> Is there some reason you guys didn't use that?
>
>> As noted though, we've basically given up and are moving to a
>> pure-userspace solution as quickly as we can.
>
> That argues against worrying about this all in the kernel unless there
> are other users.
>
> Linus

Just a note: when Greg says "user space solution", Ceph is looking at writing directly to raw block devices, which is kind of a throwback to early enterprise database trends.

Ric