Google updates Gemini AI to write and run code for image analysis
7 days ago • agentic-ai
Google announced Agentic Vision for Gemini 3 Flash on January 27, 2026, turning static image analysis into an iterative agentic process [1]. The model uses a "Think, Act, Observe" loop. It plans multi-step analysis (Think), writes and executes Python code to crop, zoom, annotate, or compute on images (Act), and appends results to its context for review (Observe) before producing a final response [1][2][3].
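The loop described above can be sketched as a generic control flow. This is a conceptual illustration only, not Google's implementation: `plan_fn` and `act_fn` are hypothetical stand-ins for the model's planner and the Python code it writes and executes.

```python
def think_act_observe(image, plan_fn, act_fn, max_steps=5):
    """Conceptual Think-Act-Observe loop (illustrative sketch only).

    plan_fn(image, context) -> next operation, or None when done  (Think)
    act_fn(image, step)     -> deterministic result of running code (Act)
    Results are appended to context for the next planning pass    (Observe)
    """
    context = []
    for _ in range(max_steps):
        step = plan_fn(image, context)    # Think: decide the next operation
        if step is None:
            break                         # planner judges the analysis complete
        result = act_fn(image, step)      # Act: run deterministic code on the image
        context.append((step, result))    # Observe: fold the result back into context
    return context

# Toy usage: a digit string stands in for an image; the "plan" is one counting step.
ops = think_act_observe(
    "8 3 8 1 8",
    plan_fn=lambda img, ctx: "count_eights" if not ctx else None,
    act_fn=lambda img, step: img.split().count("8"),
)
print(ops)  # [('count_eights', 3)]
```

The key property is that each answer in `context` comes from executed code rather than a single forward pass, which is what the article credits for the reduction in counting and table-parsing errors.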
This design ties reasoning to concrete visual evidence and shifts tasks from probabilistic guessing to deterministic execution. That reduces hallucinations in tasks such as counting digits and parsing dense tables [2][4]. Developers enable Agentic Vision by configuring the code_execution tool in Gemini API calls; the tool supports image URIs and visual scratchpads [1].
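In a raw REST call, enabling the tool amounts to adding a `code_execution` entry to `tools`. A minimal sketch of such a `generateContent` payload follows; the model ID `gemini-3-flash` and the inline image bytes are illustrative assumptions, so verify field names and model IDs against the current Gemini API documentation before use.

```python
import json

# Sketch of a Gemini generateContent payload with code execution enabled.
# ASSUMPTIONS: the model ID and image data are placeholders for illustration.
payload = {
    "contents": [{
        "parts": [
            {"text": "Count the digits in this receipt and total the line items."},
            {"inline_data": {"mime_type": "image/png", "data": "<base64-image>"}},
        ]
    }],
    # An empty object switches on the built-in code_execution tool.
    "tools": [{"code_execution": {}}],
}
print(json.dumps(payload, indent=2))
```

The same configuration is available through the official SDKs by passing a code-execution tool in the request config rather than building JSON by hand.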
Agentic Vision is available now through the Gemini API in Google AI Studio and Vertex AI, with a rollout to the Gemini app underway. Google reports a consistent 5–10% quality boost across most vision benchmarks and a 5% accuracy gain on PlanCheckSolver.com [1][3][4].
Why It Matters
- ML engineers can enable code_execution in the Gemini API to gain 5–10% on vision benchmarks without building custom agents.
- IT teams can deploy iterative visual reasoning at scale via Vertex AI and Google AI Studio, lowering time to production for vision apps.
- Deterministic, Python-based image operations reduce hallucination risk in multimodal tasks like counting and table parsing.
- Built-in Think-Act-Observe loops accelerate prototyping and let teams ship vision features in hours rather than weeks.
Trust & Verification
Source List (4)
- Google Blog (Official), Jan 27, 2026
- InfoWorld (Tier-1), Jan 27, 2026
- 9to5Google (Tier-1), Jan 27, 2026
- Business Today (Other), Jan 28, 2026
Fact Checks (4)
- Google announced Agentic Vision for Gemini 3 Flash on January 27, 2026 (VERIFIED)
- Uses Think-Act-Observe loop with Python code execution for iterative visual reasoning (VERIFIED)
- Delivers 5–10% quality boost across most vision benchmarks (VERIFIED)
- Available via Gemini API, Google AI Studio, and Vertex AI (VERIFIED)