Post History

Current version by Nick Antonaccio

Current Version, May 23, 2026 at 03:35

BTW, qwen36:35a3 q4_k_s MTP runs at 59 tps on the DGX Spark and 54 tps on Strix Halo. The 8-bit version even runs at 40 tps on the DGX Spark.

Holy crap that's quick for a very capable model, on a single piece of consumer hardware at low wattage.

Previous Versions
Version 1, May 23, 2026 at 03:35

BTW, qwen36:35a3 q4_k_s MTP runs at 59 tps on the DGX Spark. The 8-bit version even runs at 40 tps.

Holy crap that's quick for a very capable model, on a single piece of consumer hardware at low wattage.