Out of curiosity, I've tried to understand why mor1kx uses more logic than LM32, and I already discussed some points with Stefan last week.
For the comparison, I'm using MiSoC on the de0-nano and Mixxeo boards.
The mor1kx and LM32 configurations have to be almost identical, so the attached patch for MiSoC must be applied:
- it disables the store buffer (thanks to the work done by Stefan last week on that)
- it disables bursts on Wishbone
One other difference is that LM32 does not support signed division, and it's not possible to disable it on mor1kx. Removing the logic specific to the signed part of the division saves ~130 LUTs on BaseSoC on Mixxeo.
With the upstream mor1kx code, we get for BaseSoC:
LM32: 2514 regs / 3068 luts
mor1kx: 2761 regs / 3937 luts
--> LM32 beats mor1kx by ~250 regs / ~740 luts (869 - 130 for the signed division).
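As a quick sanity check of the arithmetic above, here is a small Python sketch. The function and constant names are made up for illustration, and the 130 LUT figure is just the signed-division estimate quoted above, not a separately measured value.

```python
# Resource usage for BaseSoC with upstream mor1kx, as (regs, luts).
LM32 = (2514, 3068)
MOR1KX_UPSTREAM = (2761, 3937)

# Approximate LUT cost of signed division (estimate from the text above).
SIGNED_DIV_LUTS = 130


def gap(cpu, ref, lut_adjust=0):
    """Return (extra regs, extra luts) of cpu over ref.

    lut_adjust is subtracted from the LUT difference, e.g. to discount
    a feature that the reference CPU does not implement.
    """
    regs = cpu[0] - ref[0]
    luts = cpu[1] - ref[1] - lut_adjust
    return regs, luts


print(gap(MOR1KX_UPSTREAM, LM32))                   # (247, 869)
print(gap(MOR1KX_UPSTREAM, LM32, SIGNED_DIV_LUTS))  # (247, 739)
```

The same helper can be fed the numbers from any later synthesis run to compare it against LM32 or against the upstream mor1kx figures.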
I've done some other "experiments" on the code to try to reduce this difference. Before that, I cleaned up the code to avoid mixing spaces and tabs and removed trailing whitespace; you can merge it... or not :)
The experiments so far:
- remove some unneeded resets on some ALU registers (divider and multiplier). Since decode_valid_i already invalidates the data, there is no reason to reset the data registers.
- when the PIC uses LEVEL interrupts, it's not necessary to register the IRQs.
- in my opinion, some simplifications can be made in mor1kx_ctrl_cappuccino.v to reduce logic.
which gives me:
mor1kx: 2771 regs / 3828 luts
+10 regs / -110 luts compared to the upstream code.
Timings are also better on Altera and Xilinx.
I have done only very limited tests on the code (ran the MiSoC BIOS with the SDRAM test on the de0-nano).
With that, LM32 still beats mor1kx by ~250 regs / ~600 luts... but it was ~500 regs / ~1350 luts some days ago :)