Well, if you need that many bits... guess I would need to study the code (hard with binary attachments) to understand why you need so many bits.
gcc can do long long multiplies fine, but only with a long long result.
The code presented, however, needs (at least) 96 bits of the result,
and expressing that in C would be far more complicated than doing it
with a couple of assembly statements.
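
To illustrate the point, here is a rough sketch of what a wide multiply looks like in portable C. It is not the code from the patch. It assumes 64-bit operands and a full 128-bit result, and struct u128 and mul_64x64() are hypothetical names made up for this example.

```c
#include <stdint.h>

/* Hypothetical type for illustration: a 128-bit value as two 64-bit halves. */
struct u128 {
	uint64_t hi;
	uint64_t lo;
};

/*
 * Hypothetical helper: full 64x64 -> 128-bit unsigned multiply built from
 * four 32x32 -> 64-bit partial products, since C only offers a long long
 * (64-bit) result for a long long multiply.
 */
static struct u128 mul_64x64(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t p0 = a_lo * b_lo;	/* contributes to bits   0..63  */
	uint64_t p1 = a_lo * b_hi;	/* contributes to bits  32..95  */
	uint64_t p2 = a_hi * b_lo;	/* contributes to bits  32..95  */
	uint64_t p3 = a_hi * b_hi;	/* contributes to bits  64..127 */

	/* Sum of three 32-bit quantities; cannot overflow 64 bits. */
	uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

	struct u128 r;
	r.lo = (mid << 32) | (uint32_t)p0;
	r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
	return r;
}
```

On x86 a single mul instruction already leaves a double-width result in a register pair, which is why a couple of assembly statements end up shorter than the above.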
~
I really do not see the relevance of the run time library patches
Which run time library patches are you referring to? NLKD's? If so,
these routines must not be used by code outside of the debugger (and the
converse holds, too: debugger code should avoid common kernel routines
wherever possible).
Further, it is my understanding that, for a reason unknown to me, the
Linux kernel deliberately doesn't have the full set of libgcc support
routines. Since the debugger in various places relies on being able to
do 64-bit math on 32-bit systems, I had to add these in a way so that
they'd be hidden from the rest of the kernel (and also so that they'd
satisfy the isolation rules outlined above).
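
As a rough illustration of the kind of routine meant here: on 32-bit targets gcc typically turns a 64-bit division into a call to a libgcc helper such as __udivdi3, so code that cannot rely on libgcc has to bring its own. The sketch below is not NLKD's actual implementation. sketch_udiv64() is a hypothetical name, the algorithm is plain shift-and-subtract, and the divide-by-zero behaviour is an arbitrary choice made for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: unsigned 64-bit division without using the
 * '/' or '%' operators on 64-bit types. Returns the quotient and,
 * if rem is non-NULL, stores the remainder there.
 */
static uint64_t sketch_udiv64(uint64_t n, uint64_t d, uint64_t *rem)
{
	uint64_t q = 0, r = 0;
	int i;

	if (d == 0) {
		/* Arbitrary convention for this sketch only. */
		if (rem)
			*rem = n;
		return UINT64_MAX;
	}

	for (i = 63; i >= 0; i--) {
		/* Remember the bit shifted out of r (matters when d > 2^63). */
		int carry = (int)(r >> 63);

		r = (r << 1) | ((n >> i) & 1);
		if (carry || r >= d) {
			r -= d;
			q |= (uint64_t)1 << i;
		}
	}

	if (rem)
		*rem = r;
	return q;
}
```

Declaring such a routine static within the debugger's own files is one way to keep it hidden from the rest of the kernel. Note that on some architectures even 64-bit shifts by a variable amount compile to libgcc calls, so a real version may need more care than this sketch takes.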