via Decrypt · By Decrypt Editorial

Google's DiffusionGemma AI Hits 1,000 Tokens Per Second—And It's Free

(10:01 PM UTC)
Approved by James Mitchell

In brief

  • Google released DiffusionGemma, a free open-weight model that generates entire 256-token blocks simultaneously via text diffusion—hitting over 1,000 tokens per second on an NVIDIA H100, four times faster than standard autoregressive models.
  • The custom drafter module DiffusionGemma needs for local inference doesn't exist in any public runtime yet—not in mlx-lm, not in LM Studio—making it…
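The throughput figures in the first bullet can be turned into rough back-of-the-envelope numbers. The sketch below is illustrative only: it treats "over 1,000 tokens per second" as a flat 1,000, takes the "four times faster" claim at face value, and derives an implied autoregressive baseline that the article itself does not state. All function and constant names are hypothetical.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the brief.
# Assumptions (not from the article): throughput is exactly 1,000 tok/s,
# and "four times faster" means a 4x ratio in tokens per second.

BLOCK_TOKENS = 256                # block size quoted in the brief
DIFFUSION_TOKENS_PER_SEC = 1_000  # lower bound quoted for an NVIDIA H100
CLAIMED_SPEEDUP = 4               # "four times faster" than autoregressive


def seconds_per_block(block_tokens: int, tokens_per_sec: float) -> float:
    """Average wall-clock time to emit one block at a given throughput."""
    return block_tokens / tokens_per_sec


def implied_baseline_tokens_per_sec(tokens_per_sec: float, speedup: float) -> float:
    """Autoregressive throughput implied by the claimed speedup (derived, not reported)."""
    return tokens_per_sec / speedup


def generation_seconds(n_tokens: int, tokens_per_sec: float) -> float:
    """Time to generate n_tokens at a constant throughput."""
    return n_tokens / tokens_per_sec


if __name__ == "__main__":
    baseline = implied_baseline_tokens_per_sec(DIFFUSION_TOKENS_PER_SEC, CLAIMED_SPEEDUP)
    print("seconds per 256-token block:", seconds_per_block(BLOCK_TOKENS, DIFFUSION_TOKENS_PER_SEC))
    print("implied baseline tok/s:", baseline)
    print("2,000 tokens, diffusion:", generation_seconds(2_000, DIFFUSION_TOKENS_PER_SEC), "s")
    print("2,000 tokens, baseline:", generation_seconds(2_000, baseline), "s")
```

Under these assumptions, one 256-token block takes about 0.256 seconds on average, and the implied autoregressive baseline is roughly 250 tokens per second. Real throughput depends on hardware, batch size, and the number of denoising steps, none of which the brief specifies.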
