Re: [PATCH] [RFC] List per-process file descriptor consumption when hitting file-max
From: Valdis Kletnieks
Date: Thu Jul 30 2009 - 10:18:15 EST
On Wed, 29 Jul 2009 19:17:00 +0300, Alexander Shishkin said:
> Is there anything dramatically wrong with this one, or could someone please review this?
> + for_each_process(p) {
> + files = get_files_struct(p);
> + if (!files)
> + continue;
> +
> + spin_lock(&files->file_lock);
> + fdt = files_fdtable(files);
> +
> + /* we have to actually *count* the fds */
> + for (count = i = 0; i < fdt->max_fds; i++)
> + count += !!fcheck_files(files, i);
> +
> + printk(KERN_INFO "=> %s [%d]: %d\n", p->comm,
> + p->pid, count);
1) Splatting out 'count' without a hint of what it is isn't very user friendly.
Consider something like "=> %s[%d]: open=%d\n" instead, or add a second line
to the 'VFS: file-max' printk to provide a header.
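Untested, but roughly this (get_max_files() is the existing accessor for the
limit, so the header can echo it):

        /* sketch: one header line, then a labeled per-process count */
        printk(KERN_INFO "VFS: file-max limit %d reached, open fds per process:\n",
               get_max_files());
        /* ... the for_each_process() walk from the patch ... */
        printk(KERN_INFO "=> %s[%d]: open=%d\n", p->comm, p->pid, count);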
2) What context does this run in, and what locks/scheduling considerations
are there? On a large system with many processes running, this could conceivably
wrap the logmsg buffer before syslog has a chance to get scheduled and read
the stuff out.
3) This can be used by a miscreant to spam the logs - consider a program
that does open() until it hits the limit, then goes into a close()/open()
loop to repeatedly bang up against the limit. Every 2 syscalls by the
abuser could get them another 5,000+ lines in the log - an incredible
amplification factor.
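A cheap way to blunt that amplification (sketch only, assuming the dump gets
factored out into its own helper function):

        /* printk_ratelimit() is already throttled via the
         * printk_ratelimit/printk_ratelimit_burst sysctls, so a
         * close()/open() loop only buys an occasional dump */
        if (!printk_ratelimit())
                return;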
Now, if you fixed it to only print out the top 10 offending processes, it would
make it a lot more useful to the sysadmin, and a lot of those considerations go
away, but it also makes the already N**2 behavior even more expensive...
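Something like this is what I have in mind (completely untested, and the
struct/function names are invented):

        #define NR_OFFENDERS 10

        struct fd_offender {
                pid_t pid;
                char comm[TASK_COMM_LEN];
                int count;
        };

        static struct fd_offender top[NR_OFFENDERS]; /* memset() before each dump */

        /* untested sketch: keep a descending top-10 while scanning,
         * then print just those entries after the walk */
        static void note_offender(struct task_struct *p, int count)
        {
                int i, j;

                /* find the first slot this process out-ranks */
                for (i = 0; i < NR_OFFENDERS; i++)
                        if (count > top[i].count)
                                break;
                if (i == NR_OFFENDERS)
                        return;

                /* shift the lesser offenders down, dropping the last one */
                for (j = NR_OFFENDERS - 1; j > i; j--)
                        top[j] = top[j - 1];

                top[i].pid = p->pid;
                top[i].count = count;
                get_task_comm(top[i].comm, p);
        }

That's only O(10) extra work per process on top of the fd scan you're already
doing, so the walk itself still dominates - and the log output drops from
thousands of lines to ten.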
At that point, it would be good to report some CPU numbers from running an abusive
program that repeatedly hits the limit, and be able to say "Even under full
stress, it only used 15% of a CPU on a 2.4GHz Core2" or similar...