Re: 2.6.28, rlimits, performance and debian etch

From: Adam Tkac
Date: Thu Jan 29 2009 - 07:19:20 EST


On Tue, Jan 27, 2009 at 03:17:03PM -0800, Andrew Morton wrote:
> On Wed, 21 Jan 2009 12:52:19 +0100
> Peter Palfrader <weasel@xxxxxxxxxx> wrote:
>
> > Hi,
> >
> > I spent several hours trying to get to the bottom of a serious
> > performance issue that appeared on one of our servers after upgrading to
> > 2.6.28. In the end it's what could be considered a userspace bug that
> > was triggered by a change in 2.6.28. Since this might also affect other
> > people I figured I'd at least document what I found here, and maybe we
> > can even do something about it:
> >
> >
> > So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately
> > the team maintaining our ftp archive complained that one of their
> > scripts that previously ran in a few minutes still hadn't even come
> > close to being done after an hour or so. Downgrading to 2.6.27 fixed
> > that.
> >
> > Turns out that script is forking a lot and something in it or python or
> > whereever closes all the file descriptors it doesn't want to pass on.
> > That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and
> > closes them all with a few exceptions.
> >
> > Turns out that takes a long time when your limit -n is now 2^20 (1048576).
> >
> > With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is
> > now a thousand times that.
> >
> > 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to
> > RLIM_INFINITY" (0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f)[1] that
> > allows, as the title implies, to set the limit for number of files to
> > infinity.
> >
> > Closer investigation showed that the broken default ulimit did not apply
> > to "system" processes (like stuff started from init). In the end I
> > could establish that all processes that passed through pam_limit at one
> > point had the bad resource limit.
> >
> > Apparently the pam library in Debian etch (4.0) initializes the limits
> > to some default values when it doesn't have any settings in limit.conf
> > to override them. Turns out that for nofiles this is RLIM_INFINITY.
> > Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam
> > package version 0.79-5 fixes that - tho I'm not sure what side effects
> > that has.
> >
> > Debian lenny (the upcoming 5.0 version) doesn't have this issue as it
> > uses a different pam (version).
> >
> >
> > I'm a bit unsure where to go from here. Maybe the pam library in etch
> > should be fixed. Maybe the patch should be reverted (but then it may be
> > more correct now and that's what the changelog entry suggests).
> > As a stopgap measure I could also just define nofile in limits.conf.
> >
> > Thanks for listening. Also thanks to Rik and Nocholas who helped track
> > some of this down.
> >
> > Cheers,
> > Peter
> > 1. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c2d64fb6cae9aae480f6a46cfe79f8d7d48b59f
>
> Ho hum, thanks.
>
> Well, I think we just revert it for now. We can bring it back later
> if someone is thus inclined. Along with some sort of opt-in control,
> perhaps in /proc. Which defaults to "off".
>

Hi all,

I don't think the "rlim_infinity" patch should be reverted. Let me try
to explain why.

First, code which sets limits to RLIM_INFINITY is very bad idea and
that code is Debian specific. I downloaded original pam 0.79 (stripped):

for(i = 0; i < RLIM_NLIMITS; i++) {
...
int r = getrlimit(i, &pl->limits[i].limit);
...

as you can see original pam sets limits to inherited defaults. After
code written above Debian adds their patch:

if (limits_not_defined_in_limits_conf) {
...
case RLIMIT_NOFILE:
...
pl->limits[i].limit.rlim_cur = RLIM_INFINITY;
pl->limits[i].limit.rlim_max = RLIM_INFINITY;
...
}

so as you can see inherited default limits are overriden to infinity.
In my opinion Debian should revert their patch which is, at least,
pretty incorrect and unsecure.

Next argument is POSIX compatibility (from setrlimit() specification):

"The value RLIM_INFINITY, defined in <sys/resource.h>, shall be
considered to be larger than any other limit value. If a call to
getrlimit() returns RLIM_INFINITY for a resource, it means the
implementation shall not enforce limits on that resource. Specifying
RLIM_INFINITY as any resource limit value on a successful call to
setrlimit() shall inhibit enforcement of that resource limit."

So kernel does what is expected. If you want "unlimited" number of
descriptors, you have it.

Please consider again where exactly problem is, if in Debian patch or
in kernel patch. From my point of view Debian patch should be
reverted, not the kernel one.

Please add me to CC, I'm not member of the list.

Regards, Adam

___
Adam Tkac
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/