Re: Patch to loop device (loop.c)

rhp@draper.net
Thu, 3 Dec 1998 11:00:48 -0600


Greetings Ted, Alexander and Ingo,

Ted, as the original author of the Loop device driver and e2fsck, and
ll_rw_blk guru, I am hopeful that you can clear up a lingering concern
(or set my feet on the path to a more correct understanding).

I had recently submitted a change to loop.c (implemented in the
2.1.130 kernel) which passes along block numbers to loop transfer
functions. The intent was to strengthen CBC mode encryption with
unique IV's derived from block numbers.

Ingo raises a concern:
>
> The reason is that 2.1.130 contains an update to pass the
> block number to the transfer function (which is a good thing),
> but doesn't make sure that blocks are really only accessed as a
> whole. (If you try to start to read a block in the middle of it
> you won't be able to decrypt it if the whole block is encrypted
> with CBC mode).
>

Please check my understanding and correct me where I am in error:

<logic check>

Loop I/O requires two ll_rw_blk requests:
one invokes the device driver supporting the real (backing) device and
the other invokes loop.c.

In the case of writes, loop.c always receives a complete block by definition.

In the case of reads, should the real (backing) request (which plays prior
to the loop.c request) fail to return all of the required sectors then
end_that_request_first() invokes printk() advising that an error has
occurred. Thus loop.c is assured never to see incomplete blocks. Correct?

</logic check>

Best Regards,
Reed H. Petty <rhp@draper.net>

On Thu, Dec 03, 1998 at 02:08:56AM +0000, rohloff@informatik.tu-muenchen.de wrote:
> Hi Alexander (and all others who are reading this),
>
> I've got a new loop patch which should be a lot cleaner when
> you access a file via the loop device. It uses the file ops of
> the file to access it. (It applies to 2.1.129, why see below)
>
> Still some things don't work:
> First of all, I found out that the "create_missing_block" function
> doesn't seem to work, at least it didn't work on my system.
> (If you tried to create a file system with 1024 blocks (1meg) on
> a file which was only 1 Byte long it didn't work...)
>
> Since I think that the way the patch does it should make
> create_missing_block obsolete, because the code does really
> the same, I deleted the function completely.
>
> But the handling of sparse files still doesn't work correctly
> (I will have a further look...)
>
> Then I found out that access to your harddisk slows down to a crawl,
> if you mount a block device via the loop device (unencrypted)
>
> (for example
>
> losetup /dev/loop0 /dev/hda1
> mount /dev/loop0 /mnt)
>
> )
>
> and run "bonnie" on it (bonnie is a program for measuring the
> performance of filesystem accesses). It happens when bonnie
> starts its "rewriting" phase.
>
> Obviously this is a buffer/cache issue, somehow something doesn't
> work as expected...
> The code snippet used for accessing block devices is:
> (This is executed for each block which is transferred.
> I hope the code explains itself ?)
> ----------------------
>
> bh = getblk(lo->lo_device, block, blksize);
> if (!bh) {
> printk(KERN_ERR "loop: device %s: getblk(-, %d, %d) returned NULL",
> kdevname(lo->lo_device),block, blksize);
> goto error_out_lock;
> }
> if (!buffer_uptodate(bh) && ((current_request->cmd == READ) ||
> (offset || (len < blksize)))) {
> ll_rw_block(READ, 1, &bh);
> wait_on_buffer(bh);
> if (!buffer_uptodate(bh)) {
> brelse(bh);
> goto error_out_lock;
> }
> }
>
> if ((lo->transfer)(lo, current_request->cmd, bh->b_data + offset,
> dest_addr, size)) {
> printk(KERN_ERR "loop: transfer error block %d\n", block);
> brelse(bh);
> goto error_out_lock;
> }
>
> if (current_request->cmd == WRITE) {
> mark_buffer_uptodate(bh, 1);
> mark_buffer_dirty(bh, 1);
> }
> brelse(bh);
>
> ----------------------------
>
> Can someone of the buffer/cache gurus give me a hint how to improve
> that behaviour ?
>
> If you mount a file with the loop device than there is of course also
> a performance loss, but it doesn't slow down to a crawl.
>
> (I think blocks get cached twice, first when some upper layer accesses
> the loop device via ll_rw_block, and then the second time when
> the loop device accesses the real block device. How can you avoid
> this double caching ?)
>
>
> Why is this patch against 2.1.129 ?
> The reason is that 2.1.130 contains an update to pass the
> block number to the transfer function (which is a good thing),
> but doesn't make sure that blocks are really only accessed as a
> whole. (If you try to start to read a block in the middle of it
> you won't be able to decrypt it if the whole block is encrypted
> with CBC mode).
>
>
> A safe (and efficient) way to do it would be to pass the _sector_
> number, because as far as I can tell an access to a block device
> never transfers partial sectors.
> (I'm not sure if this is also true for blocks. If it is true
> then it would be possible to make the code simpler because all
> the partial sector reading stuff could be deleted.)
>
> so long
> Ingo
>

> --- loop.c Mon Nov 30 23:56:02 1998
> +++ loop_new.c Thu Dec 3 02:00:46 1998
> @@ -59,10 +59,6 @@
> #define FALSE 0
> #define TRUE (!FALSE)
>
> -/* Forward declaration of function to create missing blocks in the
> - backing file (can happen if the backing file is sparse) */
> -static int create_missing_block(struct loop_device *lo, int block, int blksize);
> -
> /*
> * Transfer functions
> */
> @@ -148,12 +144,16 @@
>
> static void do_lo_request(void)
> {
> - int real_block, block, offset, len, blksize, size;
> - char *dest_addr;
> + int block, offset, len, blksize, size;
> + char *dest_addr,block_buf[4096];
> struct loop_device *lo;
> struct buffer_head *bh;
> struct request *current_request;
> - int block_present;
> + struct file* file;
> + int retval;
> + loff_t f_offset;
> + mm_segment_t old_fs;
> + struct inode *inode;
>
> repeat:
> INIT_REQUEST;
> @@ -203,31 +203,64 @@
> if (size > len)
> size = len;
>
> - real_block = block;
> - block_present = TRUE;
> -
> - if (lo->lo_flags & LO_FLAGS_DO_BMAP) {
> - real_block = bmap(lo->lo_dentry->d_inode, block);
> - if (!real_block) {
> -
> - /* The backing file is a sparse file and this block
> - doesn't exist. If reading, return zeros. If
> - writing, force the underlying FS to create
> - the block */
> - if (current_request->cmd == READ) {
> - memset(dest_addr, 0, size);
> - block_present = FALSE;
> - } else {
> - if (!create_missing_block(lo, block, blksize)) {
> - goto error_out_lock;
> - }
> + if (lo->lo_backing_file) { /* If we access a file use file ops to */
> + /* access the data */
> + f_offset = block * blksize + offset;
> + file = lo->lo_backing_file;
> +
> + if (file->f_op == NULL) {
> + printk(KERN_WARNING "loop: no file ops\n");
> + goto error_out_lock;
> + }
> +
> + if (file->f_op->llseek != NULL) {
> + file->f_op->llseek(file, f_offset, 0);
> + }
> + else {
> + /* Do what the default llseek() code would have done */
> + file->f_pos = f_offset;
> + file->f_reada = 0;
> + file->f_version = ++event; /* What's this ??? */
> + }
> +
> + if (current_request->cmd==READ) {
> + old_fs = get_fs();
> + set_fs(get_ds());
> +
> + retval=file->f_op->read(file,block_buf,size,&file->f_pos);
> +
> + set_fs(old_fs);
> +
> + if (retval < 0) {
> + printk(KERN_ERR "loop: cannot read block!");
> + goto error_out_lock;
> }
> -
> }
> - }
> -
> - if (block_present) {
> - bh = getblk(lo->lo_device, real_block, blksize);
> +
> + if ((lo->transfer)(lo, current_request->cmd, block_buf,
> + dest_addr, size)) {
> + printk(KERN_ERR "loop: transfer error block %d\n", block);
> + goto error_out_lock;
> + }
> +
> + if (current_request->cmd==WRITE) {
> + old_fs = get_fs();
> + set_fs(get_ds());
> + inode = file->f_dentry->d_inode;
> + down(&inode->i_sem);
> +
> + retval = file->f_op->write(file, block_buf, size, &file->f_pos);
> +
> + up(&inode->i_sem);
> + set_fs(old_fs);
> +
> + if (retval < 0) {
> + printk(KERN_ERR "loop: cannot write block!");
> + goto error_out_lock;
> + }
> + }
> + } else { /* It is a block device, so use block driver routines */
> + bh = getblk(lo->lo_device, block, blksize);
> if (!bh) {
> printk(KERN_ERR "loop: device %s: getblk(-, %d, %d) returned NULL",
> kdevname(lo->lo_device),
> @@ -274,61 +307,6 @@
> CURRENT=current_request;
> end_request(0);
> goto repeat;
> -}
> -
> -static int create_missing_block(struct loop_device *lo, int block, int blksize)
> -{
> - struct file *file;
> - loff_t new_offset;
> - char zero_buf[1] = { 0 };
> - ssize_t retval;
> - mm_segment_t old_fs;
> - struct inode *inode;
> -
> - file = lo->lo_backing_file;
> - if (file == NULL) {
> - printk(KERN_WARNING "loop: cannot create block - no backing file\n");
> - return FALSE;
> - }
> -
> - if (file->f_op == NULL) {
> - printk(KERN_WARNING "loop: cannot create block - no file ops\n");
> - return FALSE;
> - }
> -
> - new_offset = block * blksize;
> -
> - if (file->f_op->llseek != NULL) {
> - file->f_op->llseek(file, new_offset, 0);
> - } else {
> - /* Do what the default llseek() code would have done */
> - file->f_pos = new_offset;
> - file->f_reada = 0;
> - file->f_version = ++event;
> - }
> -
> - if (file->f_op->write == NULL) {
> - printk(KERN_WARNING "loop: cannot create block - file not writeable\n");
> - return FALSE;
> - }
> -
> - old_fs = get_fs();
> - set_fs(get_ds());
> -
> - inode = file->f_dentry->d_inode;
> - down(&inode->i_sem);
> - retval = file->f_op->write(file, zero_buf, 1, &file->f_pos);
> - up(&inode->i_sem);
> -
> - set_fs(old_fs);
> -
> - if (retval < 0) {
> - printk(KERN_WARNING "loop: cannot create block - FS write failed: code %d\n",
> - retval);
> - return FALSE;
> - } else {
> - return TRUE;
> - }
> }
>
> static int loop_set_fd(struct loop_device *lo, kdev_t dev, unsigned int arg)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/