3/30/2025 at 7:02:49 PM
"Twice the speed of the GBC" is a bit misleading. The clock rate of the ARM7TDMI is indeed around double the GBC's (the GBA runs at 16.78 MHz, while the GBC runs at 8.4 MHz), but cycles-per-instruction is far lower on the GBA's ARM7TDMI than on the GBC's Z80-like processor.
On GBA, most instructions take 1 cycle to execute (when running from fast memory). Not all of them do, though: memory read/write instructions, branches, and multiplies take more than one cycle.
On GBC, an instruction basically takes 4 cycles per memory access. This includes the instruction fetch itself, each additional byte of the instruction, and each memory read/write performed, plus 4 additional cycles if the instruction performs 16-bit math. (Branches have extra costs too.)
But the GBA doesn't always run code from fast memory. It gets worst-case performance when executing code directly from the cartridge: a 16-bit THUMB instruction takes 5 cycles, and a 32-bit ARM instruction takes 8 cycles. This means that a game needs to copy code into fast memory if it wants to run at high performance.
So with the full penalties of executing code directly from the cartridge, and comparing only the simplest instructions, it does end up being only about twice as fast. But when running code from fast memory, it's around 16 times faster.
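To make the arithmetic concrete, here is a rough sketch in C that turns the clock rates and per-instruction cycle counts quoted above into a throughput ratio. The function names are made up for illustration, and the model is deliberately crude: it treats every instruction as costing a fixed number of cycles and ignores everything else (instruction mix, register width, and so on).

```c
#include <assert.h>

/* Clock rates quoted in the comment, in Hz. */
#define GBA_CLOCK_HZ 16780000.0
#define GBC_CLOCK_HZ  8400000.0

/* Instructions per second if every instruction costs the same number of cycles.
   (Hypothetical helper for illustration only.) */
double instr_per_sec(double clock_hz, int cycles_per_instr)
{
    assert(cycles_per_instr > 0);
    return clock_hz / cycles_per_instr;
}

/* How many times more instructions per second the GBA gets through than the GBC,
   for the given per-instruction cycle costs on each machine. */
double gba_speedup(int gba_cycles, int gbc_cycles)
{
    return instr_per_sec(GBA_CLOCK_HZ, gba_cycles) /
           instr_per_sec(GBC_CLOCK_HZ, gbc_cycles);
}
```

Under this model, `gba_speedup(5, 4)` (THUMB from cartridge vs. a 4-cycle GBC instruction) comes out around 1.6, and `gba_speedup(8, 4)` (ARM from cartridge) around 1.0. From fast memory, `gba_speedup(1, 4)` is about 8, and `gba_speedup(1, 8)` is about 16. The 8-cycle GBC case is an assumption here: it stands for an instruction with one extra byte or memory access under the 4-cycles-per-access rule.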
by Dwedit
3/31/2025 at 6:51:58 AM
The default setup on startup is 4 cycles for ROM access, as you mention, but the system control registers let you control how fast the memory is. Apparently current ROM supports 3/1 wait states (first/sequential) for 16-bit accesses. There’s also a prefetch queue for instructions. As I understand it, this means that when running THUMB code from ROM, non-branching instructions can also run at full speed.
by ant6n
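For reference, a minimal sketch in C of how the 3/1 wait-state setting with prefetch could be encoded. The bit layout is an assumption based on my reading of the GBATEK description of the WAITCNT register (wait state 0 first access in bits 2-3, sequential access in bit 4, prefetch enable in bit 14), and the names are hypothetical, so check it against the documentation before relying on it.

```c
#include <stdint.h>

/* Assumed encodings for the wait state 0 fields (from GBATEK, not verified here). */
enum { WS0_FIRST_4 = 0, WS0_FIRST_3 = 1, WS0_FIRST_2 = 2, WS0_FIRST_8 = 3 };
enum { WS0_SEQ_2 = 0, WS0_SEQ_1 = 1 };

/* Build a WAITCNT value from the wait state 0 settings and the prefetch flag.
   All other fields (SRAM, wait states 1 and 2) are left at zero. */
uint16_t waitcnt_value(unsigned ws0_first, unsigned ws0_seq, int prefetch)
{
    uint16_t v = 0;
    v |= (uint16_t)((ws0_first & 3u) << 2);  /* first (non-sequential) access */
    v |= (uint16_t)((ws0_seq & 1u) << 4);    /* sequential access */
    if (prefetch)
        v |= (uint16_t)(1u << 14);           /* prefetch buffer enable */
    return v;
}

/* On hardware the value would be written to the register, assumed to be at
   0x04000204:
       *(volatile uint16_t *)0x04000204 = waitcnt_value(WS0_FIRST_3, WS0_SEQ_1, 1);
*/
```

With these assumptions, `waitcnt_value(WS0_FIRST_3, WS0_SEQ_1, 1)` gives `0x4014`, while an all-zero value corresponds to the 4/2 wait-state startup setting with prefetch off.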