Why Your ML Inference Pipeline Is Slower Than It Needs to Be
Most inference slowdowns aren't where you think. A practical guide to finding and fixing the real bottlenecks.
performance, ml-systems, inference
Blog
Notes on performance engineering, AI systems, and lessons from building things that need to work.