Jun 18, 2025 11:13:00

Google makes Gemini 2.5 Pro and Flash generally available and releases a preview of 'Gemini 2.5 Flash-Lite,' the cheapest and fastest model in the family

Google has expanded its Gemini 2.5 model family, making Gemini 2.5 Flash and Gemini 2.5 Pro generally available and announcing a preview of Gemini 2.5 Flash-Lite, the most cost-effective and fastest Gemini 2.5 model to date.

Gemini 2.5 model family expansions

https://blog.google/products/gemini/gemini-2-5-model-family-expands/

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities.
(PDF file) https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf

Gemini 2.5 is Google's family of hybrid reasoning models, designed to deliver strong performance while staying on the Pareto frontier of cost and speed. Following user feedback on the preview releases, Google has now made Gemini 2.5 Pro and Gemini 2.5 Flash generally available as stable versions.

Gemini 2.5 Pro, previewed alongside the Gemini 2.5 family announcement, is the most powerful model Google has ever developed. It offers strong coding and reasoning performance, along with multimodal understanding that can process up to three hours of video content.

Google announces next-generation inference AI model 'Gemini 2.5', significantly improving inference and coding performance - GIGAZINE

Gemini 2.5 Pro was first released as a preview version, and an enhanced version, Gemini 2.5 Pro Preview (I/O edition), followed in May 2025. The I/O edition improved coding performance, but users pointed out that performance in other areas had degraded, and Google's development team promised to 'fix' those regressions. The stable release reflects these fixes.

Google releases early access version of AI model 'Gemini 2.5 Pro Preview (I/O edition)' with enhanced coding capabilities - GIGAZINE

With its stable release, Gemini 2.5 Pro is now available in the Gemini smartphone app, and free plan users also get limited access. The input price is $1.25 (approximately ¥180) per million tokens, and the output price is $10.00 (approximately ¥1450) per million tokens.

Gemini 2.5 Flash, announced in April 2025, is a reasoning model with reduced compute and latency requirements. Like Gemini 2.5 Pro, it is built as a natively multimodal model that supports long-context inputs of over 1 million tokens, including text, audio, images, video, and entire code repositories.

Google announces 'Gemini 2.5 Flash,' claiming it's more cost-effective than OpenAI's 'o4-mini' - GIGAZINE

Gemini 2.5 Flash has an input price of $0.30 (approximately ¥44) per million tokens and an output price of $2.50 (approximately ¥360) per million tokens, and is available on Google AI Studio and Vertex AI. Gemini 2.5 Flash can also be accessed through the Gemini app.

Additionally, Google has announced a preview of a new model, Gemini 2.5 Flash-Lite.

Gemini 2.5 Flash-Lite delivers higher overall quality than Gemini 2.0 Flash-Lite on coding, math, science, reasoning, and multimodal benchmarks. It performs particularly well on high-volume, latency-sensitive tasks like translation and classification, achieving lower latency than both Gemini 2.0 Flash-Lite and Gemini 2.0 Flash across a wide range of prompt samples. Thinking mode is turned off by default and can be enabled via an API parameter.

Like other Gemini 2.5 models, Gemini 2.5 Flash-Lite offers the ability to turn on thinking at different budgets, connectivity to tools like Google Search and code execution, multimodal input, and a context length of 1 million tokens. Google stated that the goal of Gemini 2.5 Flash-Lite is to 'provide an economical model class that offers ultra-low latency capabilities and high throughput per price point.'
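The article does not name the API parameter that controls thinking. As a rough sketch only, the Python snippet below builds a request body with a thinking budget without sending it anywhere. The field names `generationConfig`, `thinkingConfig`, and `thinkingBudget`, the model ID string, and the helper `build_request` are all assumptions for illustration and should be checked against Google's current API documentation.

```python
import json

# Assumed preview model ID; the article does not state the exact identifier.
MODEL_ID = "gemini-2.5-flash-lite-preview-06-17"


def build_request(prompt, thinking_budget=0):
    """Build a JSON-style request body (hypothetical helper, nothing is sent).

    thinking_budget=0 mirrors the article's statement that thinking is off
    by default on Flash-Lite; a positive value requests a token budget for
    thinking. The field names are assumptions, not taken from the article.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Default: thinking disabled, suited to latency-sensitive classification.
    fast = build_request("Classify the sentiment: 'Great phone, weak battery.'")
    # Opt in to thinking with a budget for a harder prompt.
    careful = build_request("Prove that the sum of two odd numbers is even.", 512)
    print(MODEL_ID)
    print(json.dumps(fast, indent=2))
    print(json.dumps(careful, indent=2))
```

Keeping the budget as an explicit argument makes the trade-off visible at the call site: a zero budget for high-volume tasks such as translation and classification, and a larger budget only where the extra latency is acceptable.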

The table below summarizes the API usage prices and benchmark results for the Gemini 2.5 family, including Gemini 2.5 Flash-Lite. The input price for Gemini 2.5 Flash-Lite is $0.10 (approximately ¥15) per million tokens, and the output price is $0.40 (approximately ¥58) per million tokens.
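To make the price gap concrete, the following Python sketch estimates the cost of a single workload under the three per-million-token prices quoted in this article. The dictionary keys are illustrative labels rather than confirmed API model IDs, the helper `estimate_cost_usd` is hypothetical, and the calculation assumes flat per-token pricing with no tiering, caching discounts, or separate billing for thinking tokens.

```python
# USD per one million tokens, as quoted in this article.
# Keys are illustrative labels, not necessarily official model IDs.
PRICES_USD_PER_MILLION = {
    "gemini-2.5-pro": {"input": 1.25, "output": 10.00},
    "gemini-2.5-flash": {"input": 0.30, "output": 2.50},
    "gemini-2.5-flash-lite": {"input": 0.10, "output": 0.40},
}


def estimate_cost_usd(model, input_tokens, output_tokens):
    """Estimate cost in USD, assuming flat per-token pricing."""
    price = PRICES_USD_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 200,000 input tokens and 50,000 output tokens.
    for model in PRICES_USD_PER_MILLION:
        cost = estimate_cost_usd(model, 200_000, 50_000)
        print(f"{model}: ${cost:.3f}")
```

Under these assumptions the example workload comes to about $0.75 on Gemini 2.5 Pro, $0.185 on Gemini 2.5 Flash, and $0.04 on Gemini 2.5 Flash-Lite.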

The Gemini 2.5 Flash-Lite preview is currently available in Google AI Studio and Vertex AI, alongside the stable versions of Gemini 2.5 Flash and Pro. Google Search will also use custom versions of Gemini 2.5 Flash-Lite and Gemini 2.5 Flash for its AI Overviews and AI Mode.

Jun 18, 2025 11:13:00 in AI, Software, Posted by log1i_yk