A deep dive into speculative decoding: how draft models, EAGLE, Medusa, and lookahead decoding speed up LLM inference without changing the target model itself.