Re: [PATCH] Freeze bdevs when freezing processes.

From: Nigel Cunningham
Date: Mon Oct 23 2006 - 19:22:52 EST


Hi.

On Mon, 2006-10-23 at 16:07 +0200, Rafael J. Wysocki wrote:
> On Monday, 23 October 2006 14:09, Nigel Cunningham wrote:
> > Hi.
> >
> > On Mon, 2006-10-23 at 12:36 +0200, Rafael J. Wysocki wrote:
> > > On Monday, 23 October 2006 06:12, Nigel Cunningham wrote:
> > > > XFS can continue to submit I/O from a timer routine, even after
> > > > freezeable kernel and userspace threads are frozen. This doesn't seem to
> > > > be an issue for current swsusp code,
> > >
> > > So it doesn't look like we need the patch _now_.
> >
> > I'm trying to prepare the patches to make swsusp into suspend2.
>
> Oh, I see. Please don't do that.

(Yes, echoing Andrew..) Why not?

> > Please assume it's part of a stream of patches that belong together, rather
> > than an isolated modification. Yes, I'd really rather just rm -f swap.c
> > user.c swsusp.c,
>
> And I'm against that, sorry. Also I don't think you can remove user.c just
> like that.

Are you saying you're against a stream of patches that belong together?
Well, I'm sorry but that's the way things work sometimes. I know
everyone loves this evolutionary model, but it just doesn't work in
practice. We see sets of patches going through all the time in which
[2/4] prepares the foundations for [3/4] and so on. That's what I'm
doing, I just don't yet know how big the n is in [2/n].

Regarding removing user.c, I said I'd rather, not that I'm going to
submit a patch to do it :)

> > but I'm trying to do the incremental modification thing for you.
> >
> > > > but is definitely an issue for Suspend2, where the pages being written could
> > > > be overwritten by Suspend2's atomic copy.
> > >
> > > And IMO that's a good reason why we shouldn't use RCU pages for storing the
> > > image. XFS is one known example that breaks things if we do so and
> > > there may be more such things that we don't know of. The fact that they
> > > haven't appeared in testing so far doesn't mean they don't exist and
> > > moreover some things like that may appear in the future.
> >
> > Ok. We can modify the selection of pages to be overwritten. I agree that
> > absence of proof isn't proof of absence, but on the later part, we can't
> > rule some modification out now because something in the future might
> > break it.

(s/RCU/LRU applied to avoid confusion)

> This case is a bit special. I don't think it would be right to require every
> device driver writer to avoid modifying LRU pages from the interrupt
> context, because that would break the suspend to disk ...

I'm not arguing for that.

> Besides, if there is an LRU page that we _know_ we can use to store the image
> in it, we can just include this page in the image without copying. This
> already gives us one extra free page for the rest of the image and we can
> _avoid_ creating two images which suspend2 does and which adds a _lot_ of
> complexity to the code.

Do you have a number for '_lot_'? It's easy to say, but it would be
better to be able to say "I counted n lines of code you could remove if
you didn't save a full image of memory".

> > > > We can address this issue by freezing bdevs after stopping userspace
> > > > threads, and thawing them prior to thawing userspace.
> > >
> > > This will only address the issues related to filesystems, but the RCU pages
> > > can also be modified from interrupt context by other code, AFAICT.
> >
> > Agreed. We can find some other way to address that.
>
> I think, to stat with, we should look for pages that we're deadly sure will
> _not_ be overwritten from the interrupt context (for example, the pages that
> are mapped read-only by the user space look promising). Then, we should
> estimate the number of these pages and consider if dealing with them in
> a special way is worth the effort.

I was thinking of coming at it from the other direction, since I expect
the pages that might be modified from an interrupt context to be
minimal. The fact that Suspend2 has worked for years now using this
algorithm seems to confirm that.

Regards,

Nigel

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/