Discussion about this post

User's avatar
Dylan Patel's avatar

Benchmark performance by input type chart for H100, H200, MI300X pls. Make it something I gotta pay for lol. It would be cool to see FLOPS/Watt across all those HW too, cause rn all we see is some random MFU across a whole model which isnt apples to apple or theoretical peaks.

Expand full comment
Amit's avatar

Very nice article, Horace.

It’s something many of us have long suspected, but this is the first time I am seeing actual experimental results showing that peak flops are really just pie in the sky — even if memory bandwidth was infinite.

Expand full comment
15 more comments...

No posts