Google unveils cost-efficient AI model Gemini 3.1 Flash-Lite

Alphabet’s (GOOG) (GOOGL) Google unveiled Gemini 3.1 Flash-Lite, its fastest and most cost-efficient Gemini 3 series model.

The company said that starting on Tuesday, 3.1 Flash-Lite is rolling out in preview to developers via the Gemini API in Google AI Studio and for enterprises via Vertex AI.

The model is priced at $0.25/1M input tokens and $1.50/1M output tokens, according to the company.
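
As a rough illustration of what those rates mean in practice, the sketch below estimates the cost of a workload from its token counts. The prices are the ones quoted above; the function name `estimate_cost` and the sample workload are hypothetical, and actual billing may differ (for example, through preview terms or other charges not described here).

```python
from decimal import Decimal

# Preview prices quoted in the article, in US dollars per 1M tokens.
INPUT_PRICE_PER_MILLION = Decimal("0.25")
OUTPUT_PRICE_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in US dollars of one workload (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * INPUT_PRICE_PER_MILLION
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 10M input tokens and 2M output tokens.
cost = estimate_cost(10_000_000, 2_000_000)
print(f"${cost:.2f}")  # prints $5.50
```

At these rates, output tokens cost six times as much as input tokens, so workloads that generate long responses are dominated by the output price.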

Google said that Flash-Lite delivers enhanced performance at a fraction of the cost of larger models. According to the Artificial Analysis benchmark, it outperforms Gemini 2.5 Flash, with a 2.5 times faster Time to First Answer Token and a 45% increase in output speed, while maintaining similar or better quality.

The tech giant added that the model achieves an Elo score of 1432 on the Arena.ai Leaderboard and outperforms other models in a similar tier across reasoning and multimodal understanding benchmarks, scoring 86.9% on GPQA Diamond and 76.8% on MMMU Pro, and even surpasses larger Gemini models from prior generations such as 2.5 Flash.

Early-access developers on AI Studio and Vertex AI and companies like Latitude, Cartwheel, and Whering are already using the new model to solve complex problems at scale, according to Google.
