On Mon, Oct 06, 2014 at 02:18:19PM -0700, David Daney wrote:

> > As an alternative, if the space of possible instructions with a delay
> > slot is sufficiently small, all such instructions could be mapped as
> > immutable code in a shared mapping, each at a fixed offset in the
> > mapping. I suspect this would be borderline-impractical (multiple
> > megabytes?), but it is the cleanest solution otherwise.
>
> Yes, there are 2^32 possible instructions. Each one is 4 bytes, plus you
> need a way to exit after the instruction has executed, which would require
> another instruction. So you would need 32GB of memory to hold all those
> instructions, larger than the 32-bit virtual address space.

Plus, errata support for some older CPUs requires that no other instruction
that might cause an exception be present in the same cache line, inflating
the size to 32 bytes per instruction.
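
For anyone who wants to check the arithmetic, here is a rough sketch of the
sizing. It assumes one fixed-size slot per 32-bit encoding, indexed directly
by the instruction word; the names (table_bytes, slot_offset) and that layout
are made up for illustration and are not anything in the kernel.

```python
# Back-of-the-envelope sizing for a table holding every 32-bit instruction.
# All names and the slot layout are hypothetical, for illustration only.
INSN_BYTES = 4            # one MIPS instruction word
NUM_INSNS = 2 ** 32       # every possible 32-bit encoding
CACHE_LINE_BYTES = 32     # slot size if each instruction needs its own line
VA_SPACE_BYTES = 2 ** 32  # size of a 32-bit virtual address space
GIB = 2 ** 30


def table_bytes(slot_bytes):
    """Total size of a table with one fixed-size slot per encoding."""
    return NUM_INSNS * slot_bytes


def slot_offset(insn, slot_bytes):
    """Fixed offset of the slot for instruction word `insn`."""
    return insn * slot_bytes


packed = table_bytes(2 * INSN_BYTES)    # instruction + one exit instruction
padded = table_bytes(CACHE_LINE_BYTES)  # one slot per 32-byte cache line

print(packed // GIB, "GiB with 8-byte slots")
print(padded // GIB, "GiB with 32-byte slots")
print(packed // VA_SPACE_BYTES, "times the 32-bit address space")
```

Even the packed layout puts the offset of the last slot far beyond what a
32-bit pointer can address, so no choice of slot size rescues the idea.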

I've contemplated full emulation, but that would require an emulator that
is capable of most of the instruction set. With all the random ASEs around,
that would be hard to implement, while the FPU emulator trampoline as
currently used has the advantage of automatically supporting ASEs, known
and unknown. So it's a huge bonus for maintenance.