Google updates Gemini 2.5 LLM series with new entry-level model, pricing changes – SiliconANGLE
Google LLC today introduced a new large language model, Gemini 2.5 Flash-Lite, that can process prompts faster and more cost-efficiently than its predecessor.
The algorithm is rolling out as part of a broader update to the company’s flagship Gemini 2.5 LLM series. The two existing models in the lineup, Gemini 2.5 Flash and Gemini 2.5 Pro, have moved from preview to general availability. Gemini 2.5 Flash also received several pricing changes.
Gemini 2.5 made its original debut in March. The LLMs in the series are based on a mixture-of-experts architecture, which means that they each comprise multiple neural networks, or experts. When a user submits a prompt, Gemini 2.5 activates only a subset of those neural networks rather than all of them, which lowers hardware usage.
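To make the mixture-of-experts idea concrete, here is a minimal Python sketch of generic top-k routing. It is not Google's implementation, whose routing details the article does not describe. The names `route_top_k` and `moe_forward`, the toy experts and the router scores are all hypothetical; in a real model the scores come from a learned router layer and routing typically happens per token.

```python
import math


def softmax(scores):
    """Convert raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Hypothetical helper: weights are renormalized over the selected experts.
    """
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


# Toy experts standing in for neural networks (assumption for illustration).
experts = [
    lambda x: x + 1.0,
    lambda x: 2.0 * x,
    lambda x: x * x,
    lambda x: -x,
]
router_scores = [0.1, 2.0, 1.5, -1.0]  # made-up scores for a single input

selected = route_top_k(router_scores, k=2)
output = moe_forward(3.0, experts, router_scores, k=2)
```

With these made-up scores, only the experts at indices 1 and 2 run, while the other two are skipped entirely. That skipped work is where the hardware savings come from. Setting `k=1` would activate a single expert per input.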
The LLM series is the first that Google trained using its internally developed TPUv5p AI chip. According to the company, the training process involved multiple server clusters that each contained 8,960 TPUv5p chips. Google’s researchers equipped the clusters with new software that can automatically mitigate some technical issues.
Gemini 2.5 models are multimodal with support for up to 1 million tokens per prompt. Google describes the flagship algorithm in the series, Gemini 2.5 Pro, as its most capable LLM to date. During internal tests, it outperformed OpenAI’s o3-mini across a range of math and coding benchmarks.
Gemini 2.5 Flash, the model that moved into general availability today together with Gemini 2.5 Pro, trades off some performance for efficiency. It responds to prompts faster and incurs lower inference costs. Gemini 2.5 Flash-Lite, which Google debuted today, is even more efficient and is positioned as the new entry-level model in the LLM series.
“2.5 Flash Lite has all-around higher quality than 2.0 Flash-Lite on coding, math, science, reasoning and multimodal benchmarks,” Tulsee Doshi, senior director of product management for Gemini, detailed in a blog post. “It excels at high-volume, latency-sensitive tasks like translation and classification, with lower latency than 2.0 Flash-Lite and 2.0 Flash on a broad sample of prompts.”
Gemini 2.5 Flash-Lite is billed at a rate of 10 cents per 1 million input tokens when developers submit prompts that contain text, images or video. That’s less than one-tenth the cost of Gemini 2.5 Pro. The price per 1 million output tokens, in turn, is 40 cents compared with $10 for Gemini 2.5 Pro.
Google is changing the pricing of its mid-range Gemini 2.5 Flash model as part of the update. The company will now charge 30 cents per 1 million input tokens and $2.50 per 1 million output tokens compared with 15 cents and $3.50, respectively, before. Additionally, there is no longer separate pricing for tokens that the model processes in “thinking mode.” The mode allows the LLM to boost output quality by increasing the amount of time and compute resources that it uses to generate prompt responses.
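For developers weighing these rates, the arithmetic can be sketched in a few lines of Python. The snippet uses only the per-million-token prices quoted above. The names `Rate`, `RATES`, `cost_usd` and `break_even_input_output_ratio` are hypothetical, and the old Gemini 2.5 Flash rate ignores the separate thinking-mode pricing that applied previously, so treat the comparison as a rough guide rather than a billing calculator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Price in US dollars per 1 million tokens."""
    input_per_million: float
    output_per_million: float


# Figures quoted in the article. "flash-old" leaves out the separate
# thinking-mode pricing that used to apply (simplifying assumption).
RATES = {
    "flash-lite": Rate(0.10, 0.40),
    "flash-new": Rate(0.30, 2.50),
    "flash-old": Rate(0.15, 3.50),
}


def cost_usd(rate, input_tokens, output_tokens):
    """Total cost in dollars for a given number of input and output tokens."""
    return (
        input_tokens * rate.input_per_million
        + output_tokens * rate.output_per_million
    ) / 1_000_000


def break_even_input_output_ratio(old, new):
    """Input-to-output token ratio at which the old and new rates cost the same.

    Derived by solving old_in*i + old_out*o = new_in*i + new_out*o for i/o.
    """
    return (old.output_per_million - new.output_per_million) / (
        new.input_per_million - old.input_per_million
    )


# Example workload (made-up numbers): 2 million input, 1 million output tokens.
new_cost = cost_usd(RATES["flash-new"], 2_000_000, 1_000_000)
old_cost = cost_usd(RATES["flash-old"], 2_000_000, 1_000_000)
ratio = break_even_input_output_ratio(RATES["flash-old"], RATES["flash-new"])
```

Under these assumptions, the new Gemini 2.5 Flash pricing comes out cheaper whenever a workload uses fewer than roughly 6.7 input tokens per output token. Input-heavy workloads above that ratio, such as summarizing long documents into short answers, would pay more than before.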