Re: libata in 2.4.24?

From: Bill Davidsen
Date: Sun Dec 07 2003 - 00:59:56 EST


On Wed, 3 Dec 2003, Jamie Lokier wrote:

> bill davidsen wrote:
> > With O_SYNC files there is the possibility of having a don't cache bit
> > in the packet to the drive, even with write caching. With fsync I don't
> > see any way to do it after the fact for only some of the data in the
> > drive cache. That's just an observation.
>
> With fsync, can't you write all the dirty pages with that bit set,
> write _again_ all the pages in RAM which are clean but which have
> never been written with the don't-cache bit, and read-then-write with
> the bit set all the pages which are not in RAM but which were dirtied
> and written without the don't cache bit set?

Actually, what I meant was to pass the bit to the drive, so that it would
not return completion status until physical completion. That can be done
for O_SYNC. But fsync() is after the fact, the o/s has no way of knowing
that the pages already sent to the drive have been written to media except
to do a cache flush and flush everything.

I don't think there's anything else you can do with fsync() with or
without the bit, since you may no longer have the buffers to send with the
bit set. Moreover, some drives, reportedly IBM, tend to botch a sevtor
being written during power fail, if I understand your proposal it *cold*
result in a buffer being sent to the drive twice, and a power fail during
the 2nd write could clobber the good data you wrote.

That's assuming I understand what you propose.

>
> I know, it sounds a bit complicated :)
>
> But would it work?

Maybe a guru will disagree, but I would say that just switching to what
SCSI does and caching the write on the drive and not returning done status
until it completes is the right solution. Maybe a drive vendor will do
that, maybe it's in SATA-2 spec, I was just making up the bit which
indicated a realtime write buffer, as a way O_SYNC could work without
killing performance.

--
bill davidsen <davidsen@xxxxxxx>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/