Re: [PATCH] filp->f_pos not correctly updated in proc_task_readdir

From: Zhang Le
Date: Wed Mar 18 2009 - 03:27:46 EST


On 11:12 Tue 17 Mar , Eric W. Biederman wrote:
> Zhang Le <r0bertz@xxxxxxxxxx> writes:
>
> > filp->f_pos only get updated at the end of the function. Thus d_off of those
> > dirents who are in the middle will be 0, and this will cause a problem in
> > glibc's readdir implementation, specifically endless loop. Because when overflow
> > occurs, f_pos will be set to next dirent to read, however it will be 0, unless
> > the next one is the last one. So it will start over again and again.
> >
> > There is a sample program in man 2 gendents. This is the output of the program
> > running on a multithread program's task dir before this patch is applied:
> >
> > $ ./a.out /proc/3807/task
> > --------------- nread=128 ---------------
> > i-node# file type d_reclen d_off d_name
> > 506442 directory 16 1 .
> > 506441 directory 16 0 ..
> > 506443 directory 16 0 3807
> > 506444 directory 16 0 3809
> > 506445 directory 16 0 3812
> > 506446 directory 16 0 3861
> > 506447 directory 16 0 3862
> > 506448 directory 16 8 3863
> >
> > This is the output after this patch is applied
> >
> > $ ./a.out /proc/3807/task
> > --------------- nread=128 ---------------
> > i-node# file type d_reclen d_off d_name
> > 506442 directory 16 1 .
> > 506441 directory 16 2 ..
> > 506443 directory 16 3 3807
> > 506444 directory 16 4 3809
> > 506445 directory 16 5 3812
> > 506446 directory 16 6 3861
> > 506447 directory 16 7 3862
> > 506448 directory 16 8 3863
>
> I'm trying to understand what the problem that glibc
> runs into. Is this the glibc seekdir madness?
>
> Under which conditions have you seen this problem?

Please see my explanation at the bottom.

>
> The reason I ask is if this is triggered by seekdir and people really
> care then the current proc_task_readdir for the /proc/<pid>/task/
> directories is not quite sufficient. As threads come and go the d_off
> associated with a specific thread will change.
>
> Which means we probably should be returning:
>
> $ ./a.out /proc/3807/task
> --------------- nread=128 ---------------
> i-node# file type d_reclen d_off d_name
> 506442 directory 16 1 .
> 506441 directory 16 2 ..

Without my patch here, d_off will be 0. And so will be the following d_off
except the last one.

> 506443 directory 16 3809 3807
> 506444 directory 16 3811 3809
> 506445 directory 16 3814 3812
> 506446 directory 16 3863 3861
> 506447 directory 16 3864 3862
> 506448 directory 16 3865 3863
>
> Which is slightly better unfortunately it doesn't give us the
> guarantee that the d_off will be continually increasing.

Unfortunately, unlike directories on normal filesystem, there will never be any
other files in /proc/xxxx/task. So, to my understanding, d_off should be
increasing continuously, no?

OK, here is what is going on in glibc. I left it out intentionally so as not to
make the discuss unnecessarily complicated. So here it is:

First of all, we can see that the d_off does not get correctly updated.

Then, let's take a look at glibc's __GETDENTS function. In this function, there
are 3 implementations:
http://sources.redhat.com/git/gitweb.cgi?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/getdents.c;h=b33d1789adff11a04cbb1f6f5616bc8eed59418f;hb=cvs/master
Two of them will detect overflow, and lseek if necessary.

This means, although d_off in /proc/xxxx/task does not get updated correctly
everywhere, the problem of endless loop only occurs on certain platforms.
One of them is MIPS N32 ABI system.

Of course, there are other ways to work around this. However, to me, fixing this
would be the easiest and the most straight forward one. :)

Zhang, Le
http://zhangle.is-a-geek.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/