via Decrypt · By Decrypt Editorial
Google's DiffusionGemma AI Hits 1,000 Tokens Per Second—And It's Free
Decrypt Editorial (10:01 PM UTC)
In brief
- Google released DiffusionGemma, a free open-weight model that generates entire 256-token blocks simultaneously via text diffusion—hitting over 1,000 tokens per second on an NVIDIA H100, four times faster than standard autoregressive models.
- The custom drafter module DiffusionGemma needs for local inference doesn't exist in any public runtime yet—not in mlx-lm, not in LM Studio—making it…
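
To put the throughput figures in perspective, here is a rough back-of-the-envelope sketch in Python. It treats the reported 1,000 tokens per second as a floor and reads "four times faster" as a 4x throughput ratio, which is an assumption about the wording rather than a published benchmark. The variable and function names are illustrative only.

```python
# Figures taken from the article; the 4x ratio is interpreted as a
# throughput multiple, which is an assumption about the wording.
BLOCK_TOKENS = 256                 # tokens generated per diffusion block
DIFFUSION_TOKENS_PER_SEC = 1000.0  # reported floor on an NVIDIA H100
SPEEDUP = 4.0                      # "four times faster" than autoregressive


def seconds_per_block(tokens_per_sec: float, block_tokens: int = BLOCK_TOKENS) -> float:
    """Time to produce one block's worth of tokens at a given throughput."""
    return block_tokens / tokens_per_sec


# Implied autoregressive baseline under the 4x assumption.
baseline_tokens_per_sec = DIFFUSION_TOKENS_PER_SEC / SPEEDUP

diffusion_block_s = seconds_per_block(DIFFUSION_TOKENS_PER_SEC)
baseline_block_s = seconds_per_block(baseline_tokens_per_sec)

print(f"Implied baseline: {baseline_tokens_per_sec:.0f} tokens/s")
print(f"256 tokens via diffusion: {diffusion_block_s:.3f} s")
print(f"256 tokens at baseline:   {baseline_block_s:.3f} s")
```

Under these assumptions, one 256-token block takes roughly a quarter of a second, versus about one second for the same number of tokens at the implied baseline rate.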