Re: Linux 2.6.29

From: Andrew Morton
Date: Sat Apr 04 2009 - 04:24:59 EST


On Fri, 03 Apr 2009 14:59:12 -0400 Jeff Garzik <jeff@xxxxxxxxxx> wrote:

> Lennart Sorensen wrote:
> > On Fri, Apr 03, 2009 at 10:46:34AM -0400, Mark Lord wrote:
> >> My Myth box here was running 2.6.18 when originally set up,
> >> and even back then it still took *minutes* to delete large files.
> >> So that part hasn't really changed much in the interim.
> >>
> >> Because of the multi-minute deletes, the distro shutdown scripts
> >> would fail, and power off the box while it was still writing
> >> to the drives. Ouch.
> >>
> >> That system has had XFS on it for the past year and a half now,
> >> and for Myth, there's no reason not to use XFS. It's great!
> >
> > Mythtv has a 'slow delete' option that I believe works by slowly
> > truncating the file. Seems they believe that ext3 is bad at handling
> > large file deletes, so they try to spread out the pain. I don't remember
> > if that option is on by default or not. I turned it off.
>
> It's pretty painful for super-large files with lots of metadata.
>

yeah.

There's a dirty hack you can do where you append one byte to the file
every 4MB, across 1GB (say). That will then lay the file out on-disk as

one bitmap block
one data block
one bitmap block
one data block
one bitmap block
one data block
one bitmap block
one data block
<etc>
lots-of-data-blocks

So when the time comes to delete that gigabyte, the bitmap blocks are
only one block apart, and reading them is much faster.
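
A minimal sketch of that pre-layout step in C, reading "append one byte
every 4MB" as a one-byte write at the end of each 4MB stride before the
real data is written. The function name seed_layout() and its parameters
are hypothetical, and whether the on-disk result matches the layout above
depends on the filesystem and its block size.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write one byte at the end of every `stride` bytes
 * across the first `span` bytes of the file, so the filesystem allocates
 * the per-stride metadata up front, close together on disk.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int seed_layout(const char *path, off_t span, off_t stride)
{
    assert(span > 0 && stride > 0);

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    const char byte = 0;
    for (off_t off = stride - 1; off < span; off += stride) {
        /* One byte per stride; everything in between stays a hole. */
        if (pwrite(fd, &byte, 1, off) != 1) {
            close(fd);
            return -1;
        }
    }
    return close(fd);
}
```

For the numbers in the text this would be called as
seed_layout(path, 1024L << 20, 4L << 20) before recording starts.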

That was one of the gruesome hacks I did way back when I was in the
streaming video recording game.

Another was the slow-delete thing.

- open the file

- unlink the file

- now sit in a loop, slowly nibbling away at the tail with
  ftruncate() until the file is gone.

The open/unlink was there so that if the system were to crash midway,
ext3 orphan recovery at reboot time would fully delete the remainder of
the file.
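
The three steps above can be sketched in C roughly as follows. This is an
illustration only, not the code from the original product: slow_delete(),
the chunk size and the pause between truncates are all hypothetical and
would need tuning for a real recorder.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the file, unlink it, then shrink it by
 * `chunk` bytes at a time, sleeping `pause_ms` between truncates.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int slow_delete(const char *path, off_t chunk, long pause_ms)
{
    assert(chunk > 0 && pause_ms >= 0);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    /* Unlink while the file is still open: the name disappears now,
     * the blocks stay allocated until we truncate or close. */
    if (fstat(fd, &st) < 0 || unlink(path) < 0) {
        close(fd);
        return -1;
    }

    struct timespec pause = { pause_ms / 1000, (pause_ms % 1000) * 1000000L };
    off_t size = st.st_size;
    while (size > 0) {
        /* Nibble one chunk off the tail. */
        size = size > chunk ? size - chunk : 0;
        if (ftruncate(fd, size) < 0) {
            close(fd);
            return -1;
        }
        nanosleep(&pause, NULL);
    }
    return close(fd);
}
```

Something like slow_delete(path, 16L << 20, 100) would free 16MB every
100ms; smaller chunks spread the I/O out further at the cost of a longer
total delete.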


Another was to add an ioctl to ext3 to extend the file beyond EOF, but
only in metadata - the corresponding data blocks are left uninitialised.
That permitted a large number of data blocks to be allocated to the file
with high contiguity, fixing the block-intermingling problems when ext3
is writing multiple files (which reservations later addressed).

This is of course insecure, but that isn't a problem on an
embedded/consumer black box device.


ext3 sucks less nowadays, but it's still a hard vacuum.