Patch?: linux-2.5.30/drivers/block/loop.c big cleanup

From: Adam J. Richter (adam@yggdrasil.com)
Date: Mon Aug 05 2002 - 03:58:12 EST


        The following patch is a big cleanup of loop.c. The
highlights are:

        1. Did you ever wonder why everyone was reporting bio_copy
           running out of memory even when the maximum transfer size for
           a bio was typically 128kB and your system had hundreds of
           megabytes of RAM? There was a bug in loop.c where
           there were two levels of iteration walking through the
           bvecs in a bio. As a result, a bio to transfer n pages
           would usually result in a transfer of n pages, a transfer of
           n-1 pages, n-2 pages, etc., for a total of n*(n+1)/2 pages, all
           being queued immediately. I have not yet accommodated bio_copy
           failing (working on it), but fixing this bug should make
           it happen much less often and should make loop devices faster.

        2. Each /dev/loop device now has the same DMA parameters as
           its underlying device. So, bio producers can submit
           bios at the maximum size that the underlying device can
           handle.

        3. Space for loop devices is kmalloc'ed as they are set up.
           There is now almost no cost to having a high max_loop.
           max_loop is going away soon anyhow.

        4. No sector copying. Data is either remapped or there is some
           data transformation. The stock linux-2.5.30 loop.c did
           this also, but nobody realized it. It would do data
           copying in some cases, but would never use the copied
           data. It just wasted lots of memory bandwidth.

        5. Deleted some unnecessary locking, and replaced lo_sem
           with lo_thread_exited (a struct completion rather than a
           semaphore, making it smaller and avoiding a potential
           problem with waiting on a semaphore to deallocate memory
           that it occupies as described to me by some nice people on
           irc.kernelnewbies.org whose names I don't remember).
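
        To make the arithmetic in item 1 concrete, here is a small
user-space sketch (not the actual loop.c code) that counts how many
pages get queued when the bvecs of a bio are walked by two nested
loops versus a single pass. The struct toy_bio type and both function
names are hypothetical stand-ins invented for this illustration; only
the number of bvecs matters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio: only the bvec (page) count matters. */
struct toy_bio {
	size_t nr_vecs;
};

/*
 * Assumed shape of the buggy pattern: an outer loop walks the bvecs,
 * and for every starting position an inner loop queues everything
 * from that position to the end again.  Queues n + (n-1) + ... + 1 pages.
 */
size_t pages_queued_nested(const struct toy_bio *bio)
{
	size_t total = 0;

	assert(bio != NULL);
	for (size_t start = 0; start < bio->nr_vecs; start++)
		for (size_t i = start; i < bio->nr_vecs; i++)
			total++;	/* one page queued */
	return total;
}

/* Single pass over the bvecs: each page is queued exactly once. */
size_t pages_queued_single(const struct toy_bio *bio)
{
	size_t total = 0;

	assert(bio != NULL);
	for (size_t i = 0; i < bio->nr_vecs; i++)
		total++;	/* one page queued */
	return total;
}
```

        For a 128kB bio, assuming 4kB pages (so n = 32), the nested
walk counts 528 pages where the single pass counts 32, which is why
bio_copy could run out of memory on a machine with plenty of RAM.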

        I have tested this by directly mapping a disk partition
and also by running from an encrypted disk partition. More testing would
be appreciated, especially of the file-mapped case.

        Known bug:

        1. The module reference counting in fs/block_dev.c does not
           get along well with this. After creating and deleting
           a loop device, the reference count of the modules is -1.
           I think this may be a bug in block_dev.c, but I'm still
           looking into it. (I think not being able to unload the
           module is a smaller bug than the bio_copy failures that this
           patch fixes.)

        To do:

        1. Eliminate max_loop. Let users open as many /dev/loop
           devices as they want. Either leave the devfs device
           creation to user level devfsd or create /dev/loop/n+1
           when /dev/loop/n is opened.

        2. Accommodate bio_copy failures by reserving one page (or
           q->hardsect_size, whichever is greater). I have not done
           this yet because I want to at least make some changes to bio.c.

        3. After the device mapper from lvm2 is integrated into 2.5,
           consider porting the "transform" functionality to it and
           seeing if we can then eliminate loop.c.

        If nobody identifies any glaring mistakes or test failures,
I expect to test it some more tomorrow and then I'd like to get
it blessed by the appropriate person (Jens?) and try to submit it
for 2.5.31. There is still more to do with loop.c, but I'd like to
sync up at this point while the patch is small enough to be somewhat
readable.

-- 
Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."




