Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

From: Kees Cook
Date: Tue Mar 16 2021 - 15:20:05 EST

On Tue, Mar 16, 2021 at 12:43:12PM +0000, Al Viro wrote:
> On Tue, Mar 16, 2021 at 08:24:50AM +0100, Greg Kroah-Hartman wrote:
> > > Completely agreed. seq_get_buf() should be totally ripped out.
> > > Unfortunately, this is going to be a long road because of sysfs's ATTR
> > > stuff, there are something like 5000 callers, and the entire API was
> > > designed to avoid refactoring all those callers from
> > > sysfs_kf_seq_show().
> >
> > What is wrong with the sysfs ATTR stuff? That should make it so that we
> > do not have to change any caller for any specific change like this, why
> > can't sysfs or kernfs handle it automatically?
>
> Hard to tell, since that would require _finding_ the sodding ->show()
> instances first. Good luck with that, seeing that most of those appear
> to come from templates-done-with-cpp...

I *think* I can get coccinelle to find them all, but my brute-force
approach was to just do a debug build changing the ATTR macro to be
typed, and changing the name of "show" and "store" in kobj_attribute
(to make the compiler find them all).

> AFAICS, Kees wants to protect against ->show() instances stomping beyond
> the page size. What I don't get is what do you get from using seq_file
> if you insist on doing raw access to the buffer rather than using
> seq_printf() and friends. What's the point?

To me, it looks like the kernfs/sysfs API happened around the time
"container_of" was gaining ground. It's trying to do the same thing
the "modern" callbacks do with finding a pointer from another, but it
did so by making sure everything had a 0 offset and an identical
beginning structure layout _but changed prototypes_.

It's the changed prototypes that freak out CFI.
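
For illustration only, here is a userspace sketch of the container_of() style that keeps one prototype for every callback. This is not kernel code: container_of() is re-defined locally, and every struct and function name (my_attr, my_dev_attr, dev_show, call_show) is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the kernel's container_of(); userspace illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic attribute with one fixed callback prototype. */
struct my_attr {
	const char *name;
	int (*show)(struct my_attr *attr, char *buf, size_t len);
};

/* Hypothetical subsystem attribute that embeds the generic one. */
struct my_dev_attr {
	int value;
	struct my_attr attr;	/* deliberately not at offset 0 */
};

/* The callback keeps the generic prototype and recovers its own type. */
static int dev_show(struct my_attr *attr, char *buf, size_t len)
{
	struct my_dev_attr *da = container_of(attr, struct my_dev_attr, attr);

	return snprintf(buf, len, "%d\n", da->value);
}

/* The caller invokes the pointer with exactly the type it was declared with. */
static int call_show(struct my_attr *attr, char *buf, size_t len)
{
	return attr->show(attr, buf, len);
}
```

Since the function pointer is never cast to a different prototype, the indirect call matches the declared type, and the embedded struct no longer has to sit at offset 0.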

My current plan consists of these steps:

- add two new callbacks to the kobj_attribute struct (and its clones):
"seq_show" and "seq_store", which will pass in the seq_file.
- convert all callbacks to kobject/kobj_attribute and use container_of()
to find their respective pointers.
- remove "show" and "store"
- remove external use of seq_get_buf().
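
A rough userspace model of what a converted callback could look like under that plan. All names here are assumptions made up for the sketch (my_seq, my_seq_printf, my_kattr, my_widget, widget_seq_show); my_seq is only loosely modelled on the idea of a bounded seq_file buffer and is not the kernel's struct seq_file or seq_printf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the kernel's container_of(); userspace illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical bounded output buffer; count == size means "overflowed". */
struct my_seq {
	char *buf;
	size_t size;
	size_t count;
};

/* Bounded formatted append: never writes past buf + size. */
static void my_seq_printf(struct my_seq *m, const char *fmt, ...)
{
	va_list ap;
	int len;

	if (m->count < m->size) {
		va_start(ap, fmt);
		len = vsnprintf(m->buf + m->count, m->size - m->count, fmt, ap);
		va_end(ap);
		if (len >= 0 && (size_t)len < m->size - m->count) {
			m->count += (size_t)len;
			return;
		}
	}
	m->count = m->size;	/* record the overflow instead of stomping memory */
}

/* Hypothetical attribute whose callback receives the bounded buffer. */
struct my_kattr {
	const char *name;
	int (*seq_show)(struct my_kattr *attr, struct my_seq *m);
};

/* Hypothetical object embedding the attribute at a non-zero offset. */
struct my_widget {
	int level;
	struct my_kattr attr;
};

/* Converted callback: no raw buffer pointer, no prototype change. */
static int widget_seq_show(struct my_kattr *attr, struct my_seq *m)
{
	struct my_widget *w = container_of(attr, struct my_widget, attr);

	my_seq_printf(m, "%d\n", w->level);
	return 0;
}
```

The callback never sees the raw buffer, so the bounds check lives in one place rather than in every ->show() instance.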

The first two steps require thousands of lines of code changed, so
I'm going to try to minimize it by trying to do as many conversions as
possible to the appropriate helpers first. e.g. DEVICE_ATTR_INT exists,
but there are only 2 users, yet there appears to be something like 500
DEVICE_ATTR callers that have an open-coded '%d':

$ git grep -B10 '\bDEVICE_ATTR' | grep '%d' | wc -l
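
To show the kind of consolidation meant here, a userspace sketch of one shared integer show helper replacing per-attribute open-coded '%d' formatting. The names (my_attr, my_int_attr, my_show_int, MY_ATTR_INT, threshold) are hypothetical and only mimic the shape of such a helper; this is not the kernel's DEVICE_ATTR_INT definition.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the kernel's container_of(); userspace illustration only. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical generic attribute. */
struct my_attr {
	const char *name;
	int (*show)(struct my_attr *attr, char *buf, size_t len);
};

/* Hypothetical extended attribute that carries a pointer to an int. */
struct my_int_attr {
	struct my_attr attr;
	int *var;
};

/* One shared helper instead of many hand-written "%d" show functions. */
static int my_show_int(struct my_attr *attr, char *buf, size_t len)
{
	struct my_int_attr *ia = container_of(attr, struct my_int_attr, attr);

	return snprintf(buf, len, "%d\n", *ia->var);
}

/* Hypothetical declaration macro: binds a name to an int variable. */
#define MY_ATTR_INT(_name, _var)					\
	struct my_int_attr my_attr_##_name = {				\
		.attr = { .name = #_name, .show = my_show_int },	\
		.var = &(_var),						\
	}

/* Example user: no per-attribute show function is needed. */
static int threshold = 5;
static MY_ATTR_INT(threshold, threshold);
```

Each attribute converted to a helper like this is one fewer ->show() instance to audit or rewrite by hand.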

Kees Cook