Re: [PATCH] Support single byte reads from integers published in procfs by kernel/sysctl.c

From: Eric W. Biederman
Date: Sun Jan 29 2012 - 19:12:36 EST

Next message: Eric W. Biederman: "Re: [PATCH 18/29] sysctl: Normalize the root_table data structure."
Previous message: Rusty Russell: "Re: + kmod-avoid-deadlock-by-recursive-kmod-call.patch added to -mm tree"
In reply to: Earl Chew: "Re: [PATCH] Support single byte reads from integers published inprocfs by kernel/sysctl.c"
Next in thread: Earl Chew: "Re: [PATCH] Support single byte reads from integers published inprocfs by kernel/sysctl.c"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Earl Chew <echew@xxxxxxxxxxx> writes:

>> If you are interested in fixing this properly with a tiny buffer
>> reachable from struct file I think this can be worth fixing. I think
>> this is doable by using seq_file in proc_sys_read.
>
> I've looked into making proc_sys_read() use seq_file, but there are a few
> issues to work through.
>
> I'm assuming that the existing kernel modules must continue to use the
> ctl_table/proc_dointvec/etc interface without requiring patching.
>
> The two main consequences of this are:
>
> a. The existing struct ctl_table interface should be preserved.
> Kernel modules use this to publish into procfs.

Generally. Changing all 200 or so users of the ctl_table interface is a
pain to audit and make certain you having done something silly.

> b. The existing proc_dointvec(), etc, functions must be preserved
> because they are exported via EXPORT_SYMBOL. Kernel modules
> may rely on these symbols in arbitrary ways.

You only have to worry about in kernel users, and a quick source audit
should allow to be certain that there are no arbitrary weird uses
of proc_dointvec.

> I tried use seq_file in proc_sys_read, and it comes close to solving
> the problem. The issue is that proc_dointvec() requires a __user pointer.
>
> A seq_file has an internal buffer that could be filled by proc_dointvec() -- but that
> seq_file buffer is in kernel space.
>
> This is not a problem isolated to the seq_file implementation. I think that
> any approach involving a small local buffer would mean that that buffer is
> in kernel space, and that buffer cannot be passed to proc_dointvec() as it
> stands now because proc_dointvec() requires a __user buffer.

We can definitely change the signature of proc_handler and thus
proc_dointvec and friends to be friendlier to an internal buffer.

For prototyping it probably makes sense to use set_fs so you can
use an internal buffer without needing to change the method.

> One approach that might work is to add a new field to struct ctl_table :
>
> struct ctl_table
> {
> ...
> proc_seq_handler *proc_seq_handler;
> };
>
> Existing modules can continue to use proc_dointvec() etc and fixed
> code can use the new proc_seq_handler interface using the new
> proc_seq_dointvec() etc.
>
>
> Although this preserves the old interface, it seems to me that this
> approach is flawed in that kernel modules using struct ctl_table and proc_dointvec()
> continue to be broken -- they are not fixed.

It might make a good intermediate transition scheme while existing
tables are being updated.

> So, perhaps breaking assumption (b) might be a reasonable thing to do ?

Definitely.

> If we accept that proc_dointvec() etc are only or mainly used in the
> context of filling out struct ctl_table, and that other kernel modules
> don't use proc_dointvec() in other contexts, then we can change the
> call signature of proc_dointvec() to stop using a __user pointer:

Exactly.

> 1. Change signature of proc_dointvec() etc to stop using __user pointer for reads
> 2. Change definition of typedef proc_call_handler to stop using __user pointer for reads
> 3. In kernel/sysctl.c don't use copy_to_user() for reads

All of which sound like nice cleanups, and likely to reduce the number
of places where people can badly get things wrong.

> Then proc_sys_read() can use proc_dointvec() etc to fill seq_file.

Yes. That sounds like a nice cleanup.

> For writes, the existing behaviour needs to be preserved. One approach
> that would solve this would be to add:
>
> union proc_buffer {
> void __user *uptr;
> void *kptr;
> };
>
> typedef int proc_handler (struct ctl_table *ctl, int write,
> union proc_buffer buffer, size_t *lenp, loff_t *ppos);

> I think this would be preferable to either casting away __user for reads, or
> adding two pointers to the proc_handler signature (one with __user, one without).
>
> If write is indicated, use buffer.uptr, and if not the handlers would use buffer.kptr.

For interface design I would look carefully at the bitmap interface that
comes out of sysctl. I think that is the most data interface we have.

I suspect what we want for writes is to read the existing value into an
internal buffer, and then allow partial writes to the internal buffer.

I don't know if writable seq_files exist.

> But this is not perfect. Kernel modules containing their _own_ proc_handler
> definitions would now be broken severely.
>
>
> And now I circle back to the proc_seq_handler approach. Each ctl_table client
> must be fixed individually (ie switch from proc_handler to proc_seq_handler).
>
> What do you think ?

It sounds like you are thinking things through well.

For a transition strategy we may want to change proc_handler into
operations structure instead of just a function pointer.

Only having a single method we can call sounds like a pain.

struct ctl_proc_operations {
int (*proc_handler)(....);
int (*proc_seq_handler)(...);
};

Or maybe:

struct ctl_proc_operations {
int (*proc_handler)(....);
int (*proc_seq_read_handler)(...);
int (*proc_seq_write_handler)(...);
};

Having the option of different read/write methods is a plus.

More usefully this should allow us to update what the implementation
of things like proc_dointvec mean without having to update all of
the tables. Which should make it much easier to preserve correctness,
because we don't a huge number of manual edits to ctl_table structures
all over the kernel.

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Eric W. Biederman: "Re: [PATCH 18/29] sysctl: Normalize the root_table data structure."
Previous message: Rusty Russell: "Re: + kmod-avoid-deadlock-by-recursive-kmod-call.patch added to -mm tree"
In reply to: Earl Chew: "Re: [PATCH] Support single byte reads from integers published inprocfs by kernel/sysctl.c"
Next in thread: Earl Chew: "Re: [PATCH] Support single byte reads from integers published inprocfs by kernel/sysctl.c"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]