Re: Question regarding MAX_ARG_STRLEN with execve()

From: Michal Hocko
Date: Mon Jul 03 2017 - 05:22:05 EST


On Mon 03-07-17 13:58:59, Anshuman Khandual wrote:
> On 06/30/2017 07:52 PM, Michal Hocko wrote:
> > On Fri 30-06-17 11:59:37, Anshuman Khandual wrote:
> >> Hello,
> >>
> >> The execve() system call should support an argument length of
> >> MAX_ARG_STRLEN (PAGE_SIZE * 32). On 64K page size systems, we
> >> are not able to pass a 32 * PAGE_SIZE argument into the execve()
> >> system call, for the following reasons.
> >>
> >> * struct linux_binprm's vma starts with a size of PAGE_SIZE
> >>
> >> vma->vm_end = STACK_TOP_MAX;
> >> vma->vm_start = vma->vm_end - PAGE_SIZE;
> >>
> >> * The VMA then expands by the size of the arguments. So for a
> >> 32 * PAGE_SIZE argument, it becomes 33 * PAGE_SIZE.
> >>
> >> * 33 * PAGE_SIZE with 64K pages fails the following test in the
> >> get_arg_page() function. 33 * PAGE_SIZE is more than 2MB
> >> (8MB / 4) with 64K page size.
> >>
> >> if (size > READ_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4)
> >>
> >> * Right now the default RLIMIT_STACK is hard-coded to 8MB, which
> >> does not take PAGE_SIZE into account.
> >>
> >> I am wondering what the solution for this problem should be.
> >>
> >> * Change the default stack size from 8MB ?
> > Just increase the ulimit if you want to use such large arguments.
> >
>
> Yeah, that is possible, but it still does not offset the fact that
> the calculation is broken for a page size of 64K. I mean, yeah,
> it's not practical to have such a large argument. But the point
> is whether we want to support the MAX_ARG_STRLEN semantic
> for the execve system call or not. At present it's broken for 64K
> and I am asking whether we would be willing to revisit the
> '1/4th of the stack' condition.
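
To make the numbers in the quoted report concrete, here is a rough
sketch of the arithmetic in plain userspace C. It is not kernel code:
the helper names and the DEFAULT_STACK_RLIMIT macro are made up for
illustration, and it assumes, as described above, that the stack VMA
starts at one page and grows by the full argument size before being
compared against a quarter of the soft RLIMIT_STACK.

```c
#include <assert.h>
#include <stdbool.h>

/* Default soft RLIMIT_STACK as discussed in the thread: 8MB (assumed). */
#define DEFAULT_STACK_RLIMIT (8UL * 1024 * 1024)

/* MAX_ARG_STRLEN as described in the thread: 32 pages. */
unsigned long max_arg_strlen(unsigned long page_size)
{
	assert(page_size > 0);
	return 32UL * page_size;
}

/*
 * Hypothetical helper mirroring the "size > rlim_cur / 4" test quoted
 * above. The VMA is assumed to be one initial page plus the argument.
 */
bool max_arg_fits(unsigned long page_size, unsigned long stack_rlimit)
{
	unsigned long vma_size = page_size + max_arg_strlen(page_size);

	return vma_size <= stack_rlimit / 4;
}
```

With these assumptions, max_arg_fits(4096, DEFAULT_STACK_RLIMIT) holds
(33 * 4K = 132K, well under 2MB), while
max_arg_fits(65536, DEFAULT_STACK_RLIMIT) does not (33 * 64K = 2112K,
just over 2MB).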

I dunno. We have had this 1/4 of RLIMIT semantic for years and there
don't seem to have been any bug reports. Yes, MAX_ARG_STRLEN being
PAGE_SIZE dependent is unfortunate because it makes an arch-independent
default ulimit hard to get right, but I am not sure we actually have to
lose sleep over this.

Or do you have any specific proposal for how to "fix" this limitation
that wouldn't break other userspace?
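
For reference, the workaround mentioned earlier in the thread (raising
the stack ulimit before the exec) could look roughly like the sketch
below. It uses only the standard getrlimit()/setrlimit() calls; the
helper name raise_stack_soft_limit() is hypothetical, and clamping to
the hard limit is an illustrative choice, not something the thread
prescribes.

```c
#include <sys/resource.h>

/*
 * Raise the soft RLIMIT_STACK to at least `wanted` bytes, clamped to the
 * hard limit. Returns 0 on success, -1 on failure (errno is set by the
 * underlying call). Hypothetical helper, for illustration only.
 */
int raise_stack_soft_limit(rlim_t wanted)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0)
		return -1;

	/* Never lower a soft limit that is already sufficient or unlimited. */
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
		return 0;

	/* An unprivileged process cannot go above the hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
		wanted = rl.rlim_max;

	rl.rlim_cur = wanted;
	return setrlimit(RLIMIT_STACK, &rl);
}
```

Called before execve(), the new soft limit would be the one used for
the 1/4 check. Going by the numbers in the report, a 64K page system
would need a soft limit of at least 4 * 33 * 64K = 8448K for a full
MAX_ARG_STRLEN argument to pass.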
--
Michal Hocko
SUSE Labs