Re: [patch] pipe: add support for shrinking and growing pipes

From: Michael Kerrisk
Date: Sun May 23 2010 - 05:25:11 EST


On Sun, May 23, 2010 at 9:09 AM, Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
> On Sun, May 23 2010, Michael Kerrisk wrote:
>> On Sun, May 23, 2010 at 4:38 AM, Andrew Morton
>> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>> > On Sun, 23 May 2010 07:30:01 +0200 Michael Kerrisk <mtk.manpages@xxxxxxxxx> wrote:
>> >
>> >> Hi all,
>> >>
>> >> I see that this patch has hit Linus's git, so some questions:
>> >>
>> >> On Wed, May 19, 2010 at 6:49 PM, Linus Torvalds
>> >> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> >> >
>> >> >
>> >> > On Wed, 19 May 2010, Miklos Szeredi wrote:
>> >> >>
>> >> >> One issue I see is that it's possible to grow pipes indefinitely.
>> >> >> Should this be restricted to privileged users?
>> >> >
>> >> > Yes. But perhaps only if it grows past the default (or perhaps "default*2"
>> >> > or similar). That way a normal user could shrink the pipe buffers, and
>> >> > then grow them again if he wants to.
>> >> >
>> >> > Oh, and I think you need to also require that there be at least two
>> >> > buffers. Otherwise we can't guarantee POSIX behavior, I think.
>> >>
>> >> Is there any documentation (e.g., a man-pages patch) for these changes?
>> >>
>> >> The argument of the fcntl() operations is expressed in pages. I take
>> >> it that this means that the semantics of the argument will vary
>> >> depending on the system page size? So for example, 2 on x86 will mean
>> >> 8192 bytes, but will mean 32768 on ia64? That seems very weird. (And
>> >> what about architectures where the page size is switchable?) Such
>> >> changes in semantics should not be silent for the user, IMO.
>> >
>> > Well, there is getpagesize().  But I agree - this interface is just
>> > asking (x86) people to write non-portable code.
>> >
>> > otoh, if the arg was in bytes, they'd just hard-code "8192".  They're
>> > clever like that.
>> >
>> > But we have gone to some lengths to avoid exposing things like
>> > PAGE_SIZE and HZ in procfs, so it makes sense to take the same approach
>> > to syscalls.
>>
>> Quite. All of the other memory-related APIs that I can think of
>> require the user to express the info in bytes. (mlock(),
>> remap_file_pages(), mmap(), mremap(), mprotect(), shmget(), and so
>> on). Not doing the same for this interface is needlessly inconsistent.
>> And while there will be the silly users you mention above, smart users
>> will know how to do the right thing with a consistently designed
>> interface.
>
> We can easily make F_GETPIPE_SZ return bytes, but I don't think passing
> in bytes to F_SETPIPE_SZ makes a lot of sense. The pipe array must be a
> power of 2 in pages. So the question is if that makes the API cleaner,
> passing in number of pages but returning bytes? Or pass in bytes all
> around, but have F_SETPIPE_SZ round to the nearest multiple of pow2 in
> pages if need be. Then it would return a size at least what was passed
> in, or error.

I'd recommend this: Pass it in and out in bytes. Don't round to a
power of 2. Require the user to know what they are doing. Give an
error if the user doesn't supply a power-of-2 * page-size for
F_SETPIPE_SZ. (Again, consider the case of architectures with
switchable page sizes.)
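
To make the two options concrete, here is a minimal userspace sketch in
C. It is not taken from the patch: the helper names
pipe_size_is_valid() and pipe_size_round_up() are hypothetical, and the
page size is passed in explicitly (a caller would obtain it from
getpagesize()). The first helper is the strict check suggested above;
the second is the round-up alternative Jens describes. Neither handles
the "at least two buffers" requirement raised earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): strict check.
 * Returns true only if nbytes == 2^k * page_size for some k >= 0.
 */
static bool pipe_size_is_valid(size_t nbytes, size_t page_size)
{
    size_t pages;

    if (page_size == 0 || nbytes == 0 || nbytes % page_size != 0)
        return false;

    pages = nbytes / page_size;

    /* A power of two has exactly one bit set. */
    return (pages & (pages - 1)) == 0;
}

/*
 * Hypothetical helper (not from the patch): round-up alternative.
 * Returns the smallest 2^k * page_size that is >= nbytes.
 * Assumes page_size > 0 and values small enough not to overflow.
 */
static size_t pipe_size_round_up(size_t nbytes, size_t page_size)
{
    size_t pages = (nbytes + page_size - 1) / page_size;
    size_t pow2 = 1;

    while (pow2 < pages)
        pow2 <<= 1;

    return pow2 * page_size;
}
```

With the strict check, 8192 is accepted for a 4096-byte page size but
rejected for a 16384-byte one, so the caller finds out immediately
that the request does not fit the system's page size.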

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface" http://blog.man7.org/