Re: Large file i/o in 2.2.10 and 2.3.8 (which locks up!)

Anthony Barbachan (barbacha@Hinako.AMBusiness.com)
Thu, 24 Jun 1999 14:59:02 -0400 (EDT)


Does the same "attempt to access beyond end of device" message show up if
you run your program under 2.2.7? I think you ran into the file
corruption bug that has apparently appeared in the latest kernels. If
possible, try running your program under 2.2.7 or lower.

On Thu, 24 Jun 1999, Dr Mark Hagger wrote:

> Hi,
>
> I've been experimenting with writing large amounts of data (in moderate
> size blocks) to disk followed by repeatedly reading it back in (again in
> blocks). (For those interested, it's because I'm contemplating doing a
> matrix-vector multiply in an out-of-core fashion.)
>
> However, I've run into strange problems while benchmarking my approach.
> I've included the source code of this program for your perusal. I've run
> it on both 2.2.10 and 2.3.8 on my PII 400MHz (Redhat 5.2) machine, which
> has 256Mbytes RAM, an Ultra-wide SCSI bus and a 7000rpm SCSI drive.
>
> Basically the program generates a block of random data (16Mbytes) and
> writes this to a file on the disk 20 times. It then lseeks back to the
> beginning of the file and tries to read it back in, again in blocks of
> 16Mbytes.
>
> So, for 2.2.10:
>
> The writing proceeds quite well, getting bursts of 50Mbytes/s for most of
> the writes with periodic very slow writes (~1Mbytes/s) whilst
> (presumably) quantities of data are fed to the disk. The readbacks are
> quite slow, a fairly consistent 13Mbytes/s, although over time (i.e.
> over multiple reads of the entire file) this seems to degrade somewhat,
> which in itself is odd.
>
> Now, for 2.3.8:
>
> I was a little surprised at how slowly the writing to the disk goes, and
> indeed at how much the write speed varied from block to block
> (between 3Mbytes/s and 9Mbytes/s). I guess the main reason for this is to
> do with changes to the memory buffering of disk writes, and the general
> "bursty" nature of this process.
>
> Now, when the program gets to the read stage the disk goes mad and hammers
> away like crazy; the read speed apparently varies between 1 and
> 13Mbytes/s. When I ran this for a while (i.e. looped over the entire file
> read a number of times) the program eventually segfaulted and crashed, and
> actually took the machine down with it, locking it up solid. I had to
> power off and reboot.
>
> On examining the messages file I saw a vast number of messages which read:
>
> kernel: attempt to access beyond end of device
> kernel: 08:09: rw=0, want=1072687759, limit=6827593
>
> (We're talking hundreds of these, for anyone interested.) Examining this
> again I saw that these messages poured out during the first (block) read
> and then continued to pour out during subsequent reads, up until the time
> the machine locked up.
>
> This all seems a little odd to me (as ever, I can't rule out that I've
> done something silly, but the read/write/lseek calls don't report any
> errors whilst this is happening).
>
> In general though, changes to the block size seem to make quite
> considerable differences to the performance: a block size larger than
> 16Mbytes causes things to slow down considerably, although smaller block
> sizes seem to have effects as well. Possibly someone out there can tell me
> the reason for this! (Indeed, any thoughts on how to make this type of
> process run fast would be useful.)
>
> Mark
>
>
> Here's the program:
>
>
> /* test a large (blocked) file write followed by a series of
> repeated readbacks */
>
> #include <stdio.h>
> #include <math.h>
> #include <stdlib.h>
>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/time.h>
> #include <errno.h>
> #include <string.h>
>
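> int main(void) {
>     double wtime;
>
>     int nblocks=16;
>     int bsize=1000;
>     int nsize=2000;
>     double *matrix;
>     struct timeval tim;
>     struct timezone tz;
>     int i,j,fid;
>     ssize_t nread;
>
>
>     /* one block: bsize*nsize doubles = 16,000,000 bytes */
>     matrix=malloc(bsize*nsize*sizeof(double));
>     if(matrix==NULL) {
>         printf("malloc error: %s\n",strerror(errno));
>         return 1;
>     }
>
>     fid=open("/local/block_file",O_CREAT | O_RDWR,S_IRWXU );
>     if(fid<0) {
>         printf("open error: %s\n",strerror(errno));
>         return 1;
>     }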
>     for(i=0;i<nblocks;i++) {
>         /* create random numbers (from index 0, so no element is left
>            uninitialised) */
>         for(j=0;j<bsize*nsize;j++)
>             matrix[j]=drand48();
>         /* and write the matrix to the file */
>         printf("Writing block %d to file\n",i);
>         fflush(stdout);
>
>
>         gettimeofday(&tim,&tz);
>         wtime=-tim.tv_sec-tim.tv_usec/1.0e+6;
>
>         if(write(fid,matrix,sizeof(double)*nsize*bsize)<0) {
>             printf("write error: %s\n",strerror(errno));
>         }
>
>         gettimeofday(&tim,&tz);
>         wtime+=tim.tv_sec+tim.tv_usec/1.0e+6;
>
>         printf("Time to write = %le (%7.3lf Mbytes/s)\n",wtime,sizeof(double)*bsize*nsize/(wtime*1.0e+6));
>         fflush(stdout);
>     }
>
>
>     /* now try reading it back a number of times */
>
>     for(j=0;j<5;j++) {
>         off_t offset;
>         offset=0;
>         if(lseek(fid,offset,SEEK_SET)<0) {
>             printf("lseek error: %s\n",strerror(errno));
>         }
>         for(i=0;i<nblocks;i++) {
>             printf("Reading block %d from file\n",i);
>             fflush(stdout);
>
>             gettimeofday(&tim,&tz);
>             wtime=-tim.tv_sec-tim.tv_usec/1.0e+6;
>
>             nread=read(fid,matrix,sizeof(double)*nsize*bsize);
>             if(nread<0) {
>                 printf("read error: %s\n",strerror(errno));
>             }
>             /* compare as size_t only once nread is known to be >= 0 */
>             else if((size_t)nread<sizeof(double)*nsize*bsize) {
>                 printf("read weirdness %ld read\n",(long)nread);
>             }
>
>             gettimeofday(&tim,&tz);
>             wtime+=tim.tv_sec+tim.tv_usec/1.0e+6;
>
>             printf("Time to read = %le (%7.3lf Mbytes/s)\n",wtime,sizeof(double)*bsize*nsize/(wtime*1.0e+6));
>             fflush(stdout);
>         }
>     }
>
>
>
>     free(matrix);
>     close(fid);
>     return 0;
> }
>
>
> -- END PROGRAM --
>
> --
>
> Mark Hagger Tel: 01305 212803
> DERA Winfrith, Winfrith Technology Centre Fax: 01305 212103
> Winfrith, Dorset, DT2 8XJ, UK Email: mhagger@dera.gov.uk
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>
