Re: PATCH reduce impact of FIFREEZE on userland processes

From: Alun
Date: Fri Dec 07 2012 - 03:59:57 EST


Dave Chinner <david@xxxxxxxxxxxxx> said, in message
20121207004255.GC27172@dastard:
>
> The problem with doing this is that the sync can delay the freeze
> process by quite some time under the exact conditions you describe.
> If you want freeze to take effect immediately (i.e. instantly stop
> new modifications), then adding a sync will break this semantic.
> There are existing users of freeze that require this behaviour...

Ahh, that would be the subtlety I was worried might exist! Thanks.

The specific issue that brought me here was that, on a fairly heavily
loaded file server (>1000 connected Windows clients), taking an LVM
snapshot caused enough of an interruption to service that many of the
Windows clients disconnected and reconnected, so causing a huge process
load on the server - enough that we'd completely lose service and have
to reboot. Chasing this down, I noticed that FIFREEZE does a filesystem
sync, and it seemed to me that adding another one prior to blocking
writes was an easy hit.

I'm not trying to argue my case here - you've convinced me that this
change in semantics is risky and removes flexibility.

I'll try and chase this up by submitting patches to lvcreate and
fsfreeze (in the former case, I think there's no reason not to run
syncfs; in the latter perhaps it should be a command line option).
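
A minimal sketch of what that userland pre-sync could look like, assuming
a Linux system with glibc 2.14 or later. It is not taken from lvcreate or
fsfreeze; the helper names sync_then_freeze() and thaw_and_close() are
hypothetical. The idea is to flush dirty data with syncfs() while writers
are still running, so that the sync FIFREEZE performs itself has less
left to write once writes are blocked.

```c
#define _GNU_SOURCE             /* for syncfs() in <unistd.h> */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>           /* FIFREEZE, FITHAW */

/* glibc only declares syncfs() when _GNU_SOURCE is defined before the
 * first system header; repeat the prototype so the sketch also builds
 * when that is not the case. */
int syncfs(int fd);

/*
 * Hypothetical helper: flush the filesystem containing `mountpoint`
 * while writes are still allowed, then (optionally) freeze it.
 * Returns an open descriptor to pass to thaw_and_close(), or -1 with
 * errno set. FIFREEZE requires CAP_SYS_ADMIN.
 */
int sync_then_freeze(const char *mountpoint, int do_freeze)
{
    int saved;
    int fd = open(mountpoint, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Pre-sync: writers are not blocked during this call. */
    if (syncfs(fd) < 0)
        goto fail;

    /* The freeze still syncs internally, but should find less to do. */
    if (do_freeze && ioctl(fd, FIFREEZE, 0) < 0)
        goto fail;

    return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: thaw the filesystem and release the descriptor. */
int thaw_and_close(int fd)
{
    int rc = ioctl(fd, FITHAW, 0);
    int saved = errno;

    close(fd);
    errno = saved;
    return rc;
}
```

A caller would take the snapshot between sync_then_freeze(path, 1) and
thaw_and_close(fd). Writes that land between the syncfs() and the
FIFREEZE are still flushed by the freeze itself, so this only shortens
the blocked window; it does not remove it.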

> That, to me, is irrelevant, because something is normally done while
> the filesystem is frozen. It's not uncommon for freeze periods to
> extend to minutes while work is done by whatever required the
> freeze. Hence the few seconds it takes to achieve the frozen state is
> mostly irrelevant.

You've referred twice to existing systems that would break in the
presence of this change. I'm really having trouble thinking of a
situation where it's critical to have writes suspended *NOW* and where
it's valid to keep them suspended for minutes. I'd have thought that,
in the vast majority of cases, the critical thing was to minimise the
time for which writes were suspended. Would you mind describing the use
case you're thinking of?

Cheers,
Alun.