Re: [PATCH 2/8] pstore: Expose kmsg_bytes as a module parameter

From: Kees Cook
Date: Wed Oct 12 2022 - 13:58:59 EST


On Wed, Oct 12, 2022 at 12:33:36PM -0300, Guilherme G. Piccoli wrote:
> On 06/10/2022 20:32, Kees Cook wrote:
> > [...]
> > Doing a mount will override the result, so I wonder if there should be
> > two variables, etc... not a concern for the normal use case.
> >
> > Also, I've kind of wanted to get rid of a "default" for this and instead
> > use a value based on the compression vs record sizes, etc. But I didn't
> > explore it.
> >
>
> For some reason I forgot to respond to that, sorry!
>
> I didn't understand exactly how the mount would override things; I've
> done some tests:
>
> (1) booted with the new kmsg_bytes module parameter set to 64k, and it
> was preserved across multiple mount/umount cycles.
>
> (2) When I manually passed "-o kmsg_bytes=16k" during the mount
> operation, it worked as expected, setting the value to 16k (and this
> was reflected in the module parameter, as observed in /sys/module).

What I was imagining was the next step:

(3) umount, unload the backend, load a new backend, and mount it
without kmsg_bytes specified -- kmsg_bytes will be 16k, not 64k.

It's a pretty extreme corner-case, I realize. :) However, see below...
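
FWIW, here's a rough sketch of why that happens -- illustrative only, not
the actual fs/pstore code (the function name is made up): the module
parameter and the mount option both end up writing the same module-scope
variable, and nothing resets it on umount or backend unload, so the last
writer wins:

static unsigned long kmsg_bytes = 64 * 1024;	/* boot/module default */

/* "-o kmsg_bytes=16k" at mount time lands here... */
static void handle_kmsg_bytes_option(unsigned long val)
{
	kmsg_bytes = val;
}
/* ...and nothing writes the old value back on umount or backend unload */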

> In the end, if you think about it, what is the purpose of kmsg_bytes?
> Wouldn't it make sense to just fill the record_size with the maximum
> amount of data it can handle? Of course there is the partitioning thing,
> but in the end kmsg_bytes seems to be a mechanism to _restrict_ the data
> collection, so maybe the default could be a value that means "save
> whatever you can handle" (maybe 0), and if the parameter/mount option is
> set, then pstore would restrict the saved size.

Right, kmsg_bytes is the maximum size to save from the console on a
crash. The design of the ram backend was to handle really small amounts
of persistent RAM -- if a single crash would eat all of it and possibly
wrap around, it could write over useful parts at the end (since it's
written from the end to the front). However, I think somewhere along
the way, stricter logic was added to the ram backend:

	/*
	 * Explicitly only take the first part of any new crash.
	 * If our buffer is larger than kmsg_bytes, this can never happen,
	 * and if our buffer is smaller than kmsg_bytes, we don't want the
	 * report split across multiple records.
	 */
	if (record->part != 1)
		return -ENOSPC;

This limits it to just a single record.
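
Put another way (just the arithmetic, not actual kernel code): since only
part 1 is ever accepted, the most the ram backend can persist from one
crash is a single record:

#include <stddef.h>

/* Sketch: effective per-crash limit for the ram backend */
static size_t ram_effective_limit(size_t bufsize, size_t kmsg_bytes)
{
	/*
	 * The dump is truncated to kmsg_bytes, and only one
	 * bufsize-sized record (part 1) is accepted.
	 */
	return bufsize < kmsg_bytes ? bufsize : kmsg_bytes;
}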

However, this check does _not_ exist for other backends, so they will see
dumps of up to kmsg_bytes split across multiple psinfo->bufsize-sized
records (roughly sketched after the list below). And that record size is
not the same for every backend:

- efi uses 1024, even though it allocates 4096 (as was pointed out earlier)
- zone uses kmsg_bytes
- acpi-erst uses some ACPI value from ACPI_ERST_GET_ERROR_LENGTH
- ppc-nvram uses the configured size of the nvram partition
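
The splitting itself works roughly like this -- a simplified model of the
pstore_dump() loop, not the exact code; write_record() just stands in for
the backend's write callback, and it assumes there is at least kmsg_bytes
worth of console output to save:

#include <stddef.h>

static void model_kmsg_dump(size_t kmsg_bytes, size_t bufsize,
			    int (*write_record)(unsigned int part, size_t len))
{
	size_t total = 0;
	unsigned int part = 1;

	while (total < kmsg_bytes) {
		size_t chunk = kmsg_bytes - total;

		if (chunk > bufsize)
			chunk = bufsize;	/* one backend record at a time */

		/* ram's part != 1 check returns -ENOSPC here, ending the dump */
		if (write_record(part, chunk) != 0)
			break;

		total += chunk;
		part++;
	}
}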

Honestly, it seems like the 64k default is huge, but I don't think it
should be "unlimited" given the behaviors of ppc-nvram and acpi-erst.
For ram and efi, it's effectively unlimited because of the small bufsizes
(and the "only 1 record" logic in ram).

Existing documentation I can find online seems to imply making it smaller
(8000 bytes[1], 16000 bytes), but without justification. Even the "main"
documentation[2] doesn't mention it.

-Kees

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/ABI/testing/pstore
[2] https://docs.kernel.org/admin-guide/ramoops.html

--
Kees Cook