Re: Code optimization in sounddriver (Was: Naming conflict in sound drivers)

Oliver Xymoron (oxymoron@waste.org)
Thu, 24 Oct 1996 08:50:08 -0500 (CDT)


On 21 Oct 1996, Markus Gutschke wrote:

> DarrellAE@aol.com writes:
> > My limited C programming skills are puzzled also by:
> > dmasound.c:736: *p++ = get_user(userPtr++) ^ 0x80;
>
> This line
> 1) reads from user space the memory location that userPtr points to,
> 2) increments userPtr by one byte so that it points to the next data
> byte to be read,
> 3) flips the topmost bit, thereby converting from an unsigned number
> to a signed one,
> 4) stores the result to kernel space in the memory location that p
> points to,
> 5) increments p by one byte so that it points to the next data byte
> to be written.
>
> As this code is evaluated in a loop, it effectively copies a buffer
> from user space to kernel space while converting the data
> format. There is a more advanced variation of this code, a few lines
> down. It performs the same conversion for two bytes at a time.
>
> In a previous posting, Linus wrote that he discourages code that calls
> get_user or put_user in a tight loop. So, it might be possible to
> improve this copy/conversion loop by first calling copy_from_user for
> the entire data block and then performing the conversion in place. Any
> comments as to the expected performance improvement?

Doing the copy four or so bytes at a time should be significantly faster -
the bottleneck on old Linux kernels would have been memory bandwidth. On
the 2.1 series, with the restructuring of memory verification/exception
handling, separating this into two passes would probably work well,
especially if you operate on multiple bytes at a time. It might also help
to break the operation down into runs of 512 or 1k bytes that comfortably
fit in the cache, so that the multiple passes don't cause as many
additional accesses.
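
A rough user-space sketch of the chunked two-pass idea, not the actual
dmasound.c code: memcpy() stands in for the kernel's bulk copy from user
space, and the names copy_and_convert, flip_sign_bits and CHUNK are made up
for illustration. The 512-byte run length is just one of the sizes suggested
above and would need measuring on real hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical run length; assumed small enough to stay in the cache. */
#define CHUNK 512

/* Pass 2: flip the top bit of every byte, four bytes at a time. */
static void flip_sign_bits(unsigned char *buf, size_t n)
{
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, 4);   /* avoids alignment assumptions */
        w ^= 0x80808080u;         /* same as ^ 0x80 on each byte */
        memcpy(buf + i, &w, 4);
    }
    for (; i < n; i++)            /* leftover 0..3 bytes */
        buf[i] ^= 0x80;
}

/* Copy n bytes from src to dst in CHUNK-sized runs, converting each run
 * in place right after it has been copied. */
static void copy_and_convert(unsigned char *dst, const unsigned char *src,
                             size_t n)
{
    while (n > 0) {
        size_t run = n < CHUNK ? n : CHUNK;

        memcpy(dst, src, run);    /* pass 1: stand-in for the bulk user copy */
        flip_sign_bits(dst, run); /* pass 2: convert while the run is cached */

        dst += run;
        src += run;
        n -= run;
    }
}
```

Because XOR acts on each bit independently, the word-wide XOR with
0x80808080 gives the same result as the byte-wise ^ 0x80 regardless of byte
order.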

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.."