265 tokens/second at 30 watts. Purpose-built transformer inference silicon that outperforms the H100 at 1/23rd the power.
4,850 lines. Zero unnecessary code. An operating system designed around the hardware, not adapted to it.
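
As a rough sanity check, the throughput and power figures above can be turned into energy per token. The sketch below uses only the numbers quoted here (265 tokens/second, 30 watts, a 1/23 power ratio). The function names are illustrative, and the comparison wattage is simply what the stated ratio implies, not a measured H100 figure; the text does not specify the model, batch size, or H100 configuration behind the comparison.

```python
# Figures quoted in the text above; everything else is derived arithmetic.
TOKENS_PER_SECOND = 265.0
POWER_WATTS = 30.0
POWER_RATIO = 23.0  # "1/23rd the power"


def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy spent per generated token (watts are joules per second)."""
    return power_watts / tokens_per_second


def tokens_per_joule(power_watts: float, tokens_per_second: float) -> float:
    """Tokens generated per joule of energy."""
    return tokens_per_second / power_watts


def implied_baseline_watts(power_watts: float, ratio: float) -> float:
    """Power draw of the comparison device implied by the stated ratio."""
    return power_watts * ratio


if __name__ == "__main__":
    print(f"{joules_per_token(POWER_WATTS, TOKENS_PER_SECOND):.3f} J/token")
    print(f"{tokens_per_joule(POWER_WATTS, TOKENS_PER_SECOND):.2f} tokens/J")
    print(f"{implied_baseline_watts(POWER_WATTS, POWER_RATIO):.0f} W implied baseline")
```

At the quoted figures this works out to roughly 0.113 joules per token, and the 1/23 ratio implies a comparison device drawing about 690 watts.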