Re: PROBLEM: pthread-safety bug in write(2) on Linux 2.6.x

From: Linus Torvalds
Date: Thu Apr 13 2006 - 18:40:25 EST




On Thu, 13 Apr 2006, Alan Cox wrote:
>
> The only serious case historically has been O_APPEND which does have
> pretty precise semantics. Nowadays we also have pread/pwrite which have
> pretty clear semantics and deal with threading. The O_APPEND case is
> very important to get correct and 2.4 certainly did so.

pread/pwrite are automatically safe: they take an explicit offset and
never touch f_pos.
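
As a rough userspace illustration of that point (not part of the original
thread; the helper names, record size and thread count are made up for the
example), several threads can share one descriptor and write with pwrite()
at their own offsets. None of them reads or updates the shared file
position, so there is nothing to race on:

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define NTHREADS 4	/* hypothetical values, chosen for the example */
#define RECLEN   8

struct job {
	int fd;
	int id;
};

static void *writer(void *arg)
{
	struct job *j = arg;
	char rec[RECLEN];

	memset(rec, 'A' + j->id, RECLEN);
	/* Explicit offset: the shared f_pos is neither read nor updated. */
	if (pwrite(j->fd, rec, RECLEN, (off_t)j->id * RECLEN) != RECLEN)
		return (void *)-1;
	return NULL;
}

/*
 * Write NTHREADS records through one shared fd, then read the file back
 * into buf (at least NTHREADS * RECLEN + 1 bytes).  Returns the final
 * f_pos of the descriptor, or -1 on error.
 */
static off_t pwrite_demo(const char *path, char *buf)
{
	pthread_t t[NTHREADS];
	struct job jobs[NTHREADS];
	off_t pos = -1;
	int fd, i, started = 0, failed = 0;

	assert(buf != NULL);
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	for (i = 0; i < NTHREADS; i++) {
		jobs[i].fd = fd;
		jobs[i].id = i;
		if (pthread_create(&t[i], NULL, writer, &jobs[i]) != 0) {
			failed = 1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++) {
		void *ret;

		if (pthread_join(t[i], &ret) != 0 || ret != NULL)
			failed = 1;
	}

	if (!failed &&
	    pread(fd, buf, NTHREADS * RECLEN, 0) == NTHREADS * RECLEN) {
		buf[NTHREADS * RECLEN] = '\0';
		/* Still the initial position: nobody moved it. */
		pos = lseek(fd, 0, SEEK_CUR);
	}
	close(fd);
	unlink(path);
	return pos;
}
```

Whatever order the threads run in, each record lands in its own slot, and
the descriptor's position is the same afterwards as it was right after
open().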

Our O_APPEND handling should be safe - although since we do it at a FS
level it actually depends on the filesystem itself. Most (all that use the
generic routines at least) filesystems will get the inode semaphore for
writing, and do the position handling within that semaphore.

So we follow the specs, but..

> Outside of O_APPEND the specification says only that
> - The write starts at the file position
> - The file position is updated before the syscall returns
>
> It makes no other guarantee I can see.

Right. I think this is purely a "quality of implementation" issue. We
already follow the spec, the question is whether we want to be better than
that.

> As such I believe that the O_APPEND case must be kept locked properly and
> the non O_APPEND cases are already correctly handled by the kernel. That
> seems to argue for f_pos serialization on O_APPEND only.

f_pos doesn't really matter for O_APPEND, since we ignore it and use
the file size as the position. Which is why the patch I sent out doesn't
matter for that case (and why we already get O_APPEND right - we check the
file size within the inode semaphore/mutex).
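
The user-visible side of this can be sketched as follows (again not from
the original thread; append_demo() and the record contents are invented
for illustration, and this only shows the behaviour seen from userspace,
not the kernel locking). Two O_APPEND descriptors write to the same file,
and one of them is rewound in between. Every write still lands at the
current end of file:

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Append three records via two O_APPEND descriptors, rewinding one of
 * them in between, then read the file back into buf (buflen bytes).
 * Returns the number of bytes read back, or -1 on error.
 */
static ssize_t append_demo(const char *path, char *buf, size_t buflen)
{
	ssize_t n = -1;
	int a, b;

	assert(buf != NULL && buflen > 0);
	a = open(path, O_RDWR | O_CREAT | O_TRUNC | O_APPEND, 0600);
	if (a < 0)
		return -1;
	b = open(path, O_WRONLY | O_APPEND);
	if (b < 0) {
		close(a);
		unlink(path);
		return -1;
	}

	if (write(a, "one\n", 4) != 4)
		goto done;
	/* b's own f_pos is still 0, yet the data goes to the end. */
	if (write(b, "two\n", 4) != 4)
		goto done;
	/* Rewinding a changes its f_pos, but the next write ignores it. */
	if (lseek(a, 0, SEEK_SET) < 0)
		goto done;
	if (write(a, "three\n", 6) != 6)
		goto done;

	n = pread(a, buf, buflen - 1, 0);
	if (n >= 0)
		buf[n] = '\0';
done:
	close(b);
	close(a);
	unlink(path);
	return n;
}
```

Nothing is overwritten: the file ends up with the three records in the
order they were written, regardless of where either f_pos pointed.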

Linus