
Google’s Gemma 4 open AI models use “speculative decoding” to get up to 3x faster - Ars Technica
> Google's Gemma 4 open AI models introduce speculative decoding, a technique that delivers up to 3x faster inference without degrading output quality. This matters because it directly challenges the ...
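
For readers unfamiliar with the technique, here is a minimal sketch of the greedy form of speculative decoding. It is not Google's implementation: `target_next` and `draft_next` are hypothetical toy stand-ins for a large model and a small draft model, and every name below is an assumption made for illustration. A cheap draft model guesses several tokens ahead, and the large model keeps the guesses only up to the first one it disagrees with, so the final text matches what the large model would have produced on its own.

```python
from typing import Callable, List, Tuple

# A "model" here is any function mapping a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def target_next(prefix: List[int]) -> int:
    """Hypothetical large model: a deterministic toy rule, not a real network."""
    return (sum(prefix) * 31 + len(prefix)) % 50


def draft_next(prefix: List[int]) -> int:
    """Hypothetical small model: agrees with the target except at every 4th position."""
    guess = target_next(prefix)
    return guess if len(prefix) % 4 != 3 else (guess + 1) % 50


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, number of verification rounds)."""
    out = list(prompt)
    goal = len(prompt) + n_new
    rounds = 0
    while len(out) < goal:
        # 1. The draft model proposes up to k tokens, one after another.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))

        # 2. The target checks the proposal. A real system scores all k positions
        #    in one batched forward pass; this toy calls the target per position.
        rounds += 1
        for tok in proposal:
            expected = target(out)
            if tok == expected:
                out.append(tok)       # draft guess accepted
            else:
                out.append(expected)  # first mismatch: keep the target's token
                break                 # discard the rest of the proposal
    return out, rounds


if __name__ == "__main__":
    prompt = [1, 2, 3]
    fast, rounds = speculative_decode(target_next, draft_next, prompt, n_new=20)
    slow = greedy_decode(target_next, prompt, n_new=20)
    print("identical output:", fast == slow)
    print("verification rounds:", rounds, "vs. 20 sequential target steps")
```

Every token appended is the one the target model would have chosen, which is why the output quality is unchanged in this greedy setting. Any speed-up comes from the target confirming several tokens per round, so it depends on how often the draft model guesses right. Sampling-based variants replace the exact-match check with a probabilistic accept/reject rule.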