Re: [rfc] Ignore Fsync Calls in Laptop_Mode

From: Ted Ts'o
Date: Thu May 26 2011 - 12:21:51 EST


On Thu, May 26, 2011 at 06:05:43PM +0200, D. Jansen wrote:
> Problem: any fsync call by any application spins up the hard disk any
> time even in laptop_mode

What you call a problem, I call a feature. If an application doesn't
participate in the write aggregation protocol, the worst that happens
is that you waste battery power. I consider this a lesser evil
than data loss.

Similarly, if an application really _needs_ to write to disk, and it
can't contact the coordinating daemon, or the coordinating daemon
doesn't respond in a reasonable amount of time, the application should
feel free to write the data to disk and fsync(). This might waste a
bit of power, but power is cheaper than lost data.

> Because though there is no possibility to destroy data that is on disk
> due to non FIFO flushing of application writes queued in the kernel,
> which seems to be the main kernel level problem, yet new problems come
> up.

I'm not sure what you're talking about here. Buffered data can always
be reordered in terms of when it is written to disk. This is
considered good, and normal. If you want to guarantee that
application writes are pushed out to disk, then either (a) use
O_DIRECT, or (b) use fsync(). Those are your two options.
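
For option (b), a minimal sketch of what an application does. The
function name is illustrative only. Note that in a language with its own
userspace buffering (Python here), the data has to be flushed to the
kernel before fsync() can do anything useful. O_DIRECT is not shown
because it requires suitably aligned buffers and sizes.

```python
import os


def durable_append(path, record):
    """Append record to path; return only after fsync() has succeeded."""
    with open(path, "ab") as f:
        f.write(record)        # lands in the userspace buffer
        f.flush()              # hands the data to the kernel page cache
        os.fsync(f.fileno())   # asks the kernel to push it to the device
```

Until fsync() returns, the kernel is free to write this data whenever,
and in whatever order, it sees fit.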

If we didn't (for example) reorder writes to keep the hard disk head
from seeking all over the disk, that would actually cause more power
to be consumed!
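
A toy illustration of the seek argument, not a model of any real I/O
scheduler: the block numbers are invented, and head travel is
approximated as the absolute distance between consecutive block numbers.

```python
def head_travel(blocks, start=0):
    """Total distance the head moves when servicing blocks in this order."""
    pos, total = start, 0
    for block in blocks:
        total += abs(block - pos)
        pos = block
    return total


# Writes in the order the applications submitted them (made-up numbers).
submitted = [900, 10, 850, 20, 800, 30]

fifo_travel = head_travel(submitted)            # strict FIFO order
swept_travel = head_travel(sorted(submitted))   # one sweep across the disk

print(fifo_travel, swept_travel)   # 5010 900
```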

> Now there is
> 1) special support needed on the application side.

Yep, because this is fundamentally an application-level problem, and
the kernel doesn't have enough semantic information to solve the
database coherency problem.

> 2) need for new out-of-kernel buffers.

Yes. So?

> 3) need for inter-application write alignment nightmares. This sort of
> structure could cause very uncomfortable bugs that prevent writes from
> happening at all in cases that were not foreseen at all.

Huh? I think you are talking about the order in which buffered writes
happen, and there's no problem here. It's a feature that they can be
reordered. See above.

> 4) need for resources wasted through yet another daemon.

A daemon doesn't have to take up much space. If it is linked with all
of the GNOME libraries in the world, yeah, there'll be a problem, but
there's no reason that this daemon should take more than, say, a few
tens of kilobytes at most.

> 5) If the _application_, but not the kernel crashes, the data is safe.
> In my experience this is the much more likely case than that the mail
> server on my netbook optimized for battery time receives an email in
> laptop mode, sends the other server "200" and then before the next
> commit window my battery slips out and it's all gone.

Huh? What's the problem that you're worried about here?

> I think the alternative of ensuring the application writes are
> committed in order would make more sense:
> e.g. a _user space library_ disables fsync etc. in laptop_mode if the
> user chooses to do so and kernel support for forced FIFO ordering of
> writes.
> This would fix 1) 2) 3) 4) 5) 6).

And if you do this to a MySQL daemon, or to a Firefox or Chrome
process which uses SQLite, and you crash at the wrong time, the entire
database could be scrambled. You can't fix this with your solution,
because you want to make fsync() lie to the database code. And so all
of the extra work (and power) consumed by the database code to try to
make its database writes be safe, will be compromised by making
fsync() unreliable.

> So you've re-thought this "All that is necessary is a kernel patch to
> allow laptop_mode to disable fsync() calls(...)"
> (http://tytso.livejournal.com/2009/03/15/). That post had inspired my
> patch.

I was thinking about things only from a file system perspective. The
problem is that more and more people are running databases or other
binary files which are updated in place on their laptops, and from a
more holistic perspective, we have to worry about making sure that
application-level databases are coherent in the face of a system
crash. (For example, you drop your mobile phone, or your tablet, or
your laptop, and the battery slips out.)

- Ted