Re: [concept & "good taste" review] persistent store

From: Linus Torvalds
Date: Sat Dec 18 2010 - 13:30:57 EST


On Fri, Dec 17, 2010 at 3:53 PM, Tony Luck <tony.luck@xxxxxxxxx> wrote:
> On Fri, Dec 17, 2010 at 3:11 PM, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>> There are two models I can think of:
>>
>> 1. a file where the head is automatically dropped as space requires.
>> 2. a filesystem where the oldest files are automatically reclaimed.
>>
>> 1 has been implemented in actual systems, 2 is kind of a logical extension.
>
> #2 sounds more applicable here (we have some multi-kilobyte
> blobs of data, one from each kmsg_dumper invocation - and
> it would seem useful to keep them as separate entities)

So I would argue that what we'd want is actually more of a mix of the two.

You want to have a ring of events, and into that ring you also have a
"this event has been read" pointer. And you _never_ overwrite entries
that haven't been read yet, because quite frankly, if you get some
nasty memory corruption, you may end up with a thousand oopses in
rapid succession, and the latter ones are likely to be just fallout
from the earlier ones. So you definitely don't want to overwrite the
earlier ones, because they are more likely to contain the clues about
the actual original cause.

At the same time, you do want to have the capability of saying "I've
seen this", and let it be overwritten. For example, if we end up
teaching syslogd or something like that to use this, syslogd would
write the oops to disk, do an fdatasync() on the oops file, and after
it's stable on disk it can mark it "read".
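
As a rough userspace sketch of that ordering (not anything syslogd
actually does today): append the record to a file, fdatasync() it, and
only report success once the data is stable. The function name
save_durably() and the file layout are assumptions made up for this
illustration; open(), write(), fdatasync() and close() are the ordinary
POSIX calls.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: append one oops record to 'path' and return 0
 * only after fdatasync() has succeeded. Returns -1 on any failure.
 */
int save_durably(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until done */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    /* the record is not "saved" until this returns success */
    if (fdatasync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd) < 0 ? -1 : 0;
}
```

The caller would mark the entry "read" only when this returns 0. On
any failure the entry stays unread, so it cannot be overwritten before
a later attempt manages to save it.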

Also, since this is very much about persistent storage, I think any
events from a previous boot that still exist should be marked "read".
You still want to be able to read them (so marking something "read"
does not mean that it goes away), but if a new oops happens, you don't
want some old entries from long ago to stop it from being written to
persistent storage. So if you don't have any syslogd or any other tool
that saves things to disk, you'd still get the new oopses into
persistent storage.
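
To make the rules concrete, here is a minimal sketch of such a ring in
plain C. Everything in it is an assumption for illustration: the names
(struct ring, ring_append() and so on), the fixed-size slots and the
tiny capacity are invented here and are not an existing kernel
interface. The "read" pointer is kept as a sequence number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SLOTS 4    /* hypothetical capacity */
#define EVENT_MAX  64   /* hypothetical max record size */

struct ring {
    unsigned long head;       /* sequence number of the next record */
    unsigned long read_upto;  /* records with seq < read_upto are "read" */
    char slot[RING_SLOTS][EVENT_MAX];
};

/* Store a record. Fails with -1 rather than overwrite an unread entry. */
int ring_append(struct ring *r, const char *msg)
{
    assert(r->head >= r->read_upto);
    if (r->head - r->read_upto >= RING_SLOTS)
        return -1;      /* every slot holds an unread record */

    char *dst = r->slot[r->head % RING_SLOTS];
    strncpy(dst, msg, EVENT_MAX - 1);
    dst[EVENT_MAX - 1] = '\0';
    r->head++;
    return 0;
}

/* "I've seen this": everything up to and including seq may be reused. */
int ring_mark_read(struct ring *r, unsigned long seq)
{
    if (seq >= r->head)
        return -1;
    if (seq + 1 > r->read_upto)
        r->read_upto = seq + 1;
    return 0;
}

/* Read entries stay available until a newer record reuses their slot. */
const char *ring_get(const struct ring *r, unsigned long seq)
{
    if (seq >= r->head || r->head - seq > RING_SLOTS)
        return NULL;
    return r->slot[seq % RING_SLOTS];
}

/* At boot, whatever survived is from a previous boot: mark it all read. */
void ring_boot(struct ring *r)
{
    r->read_upto = r->head;
}
```

With four slots, a fifth oops in a row is dropped instead of clobbering
the first one. After ring_boot() the old entries can still be fetched
with ring_get(), but they no longer block new ones from being stored.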

Doesn't that sound like the best of both worlds?

Linus