Google's agent strategy spans consumer (Gemini app, Mariner), developer (Antigravity IDE, Vertex AI Agent Builder), and platform (Gemini API + function calling). This post is the practical lay of the land. Read the overview first for the conceptual foundation.
The Google agent stack
- Project Mariner — Google's browser-controlling agent. Available in Gemini Advanced / AI Pro / Ultra tiers.
- Antigravity — Google's agentic IDE with Gemini at the center. Multi-file code edits, command execution, browser automation in-IDE. See our Antigravity post.
- Vertex AI Agent Builder — the cloud platform for building production agents with state management, tool registry, evaluation. Enterprise-grade.
- Gemini API + function calling — the developer primitive for programmatic agents.
- Gemini Live / multimodal voice agents — real-time voice agents with vision and screen-share.
Project Mariner: the browser agent
Mariner is Google's answer to ChatGPT Operator: a browser-controlling agent that opens tabs, navigates, fills forms, and reports results. Available in Gemini Advanced subscriptions.
Mariner's strengths:
- Tight Chrome integration — it's literally running Chrome under the hood, so sites it visits behave normally.
- Vision-strong — Gemini's native multimodal nature means screenshot reasoning is excellent.
- Long-running tasks — sustained browser sessions over many steps.
- Google service integration — can pivot from web to Gmail, Drive, Calendar without context switching.
Tradeoffs: Mariner is consumer-product positioned. For programmatic browser agents you'd typically use Playwright + Vertex Agent Builder, not Mariner directly.
Antigravity: the IDE agent
For development work, Antigravity is Google's Claude Code analog. A desktop IDE (VS Code fork) with Gemini-powered agent at the center. Multi-file edits, command execution, browser tools, Workspace integration.
For an honest comparison vs Claude Code, see our Antigravity post. Summary: solid product, especially strong in Google-ecosystem workflows; for iOS-specific dev work Claude Code is still the leader.
Vertex AI Agent Builder
The production-grade Google Cloud service for building, deploying, and managing agents. Handles: agent definition, tool registration, conversation state, deployment, evaluation, monitoring, multi-region availability.
Use Vertex Agent Builder when:
- You're building an agent that needs enterprise SLAs.
- You're already on GCP and want unified billing.
- You need compliance posture Google provides (HIPAA, certain regional regimes).
- You need to compare multiple model versions via evaluation tools.
Gemini API + function calling
The developer primitive. Provide function schemas; Gemini calls them in response to user requests; you execute them; loop.
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GOOGLE_API_KEY);
const tools = [{
functionDeclarations: [
{ name: 'read_file', parameters: { type: 'OBJECT', properties: { path: { type: 'STRING' } }, required: ['path'] } }
]
}];
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-pro', tools });
const chat = model.startChat();
async function runAgent(goal) {
let response = await chat.sendMessage(goal);
while (true) {
const calls = response.functionCalls();
if (!calls || calls.length === 0) return response.text();
const results = [];
for (const c of calls) {
const r = await executeTool(c.name, c.args);
results.push({ functionResponse: { name: c.name, response: r } });
}
response = await chat.sendMessage(results);
}
}
Pattern is similar to Anthropic / OpenAI — agents are agents. The differences are in tool ecosystem, model behavior, and platform integration.
Where Gemini agents win
- Long context. Gemini's 1M-2M token context is unmatched. Agents reasoning over very large codebases or document corpora benefit.
- Multimodal-native. Vision and video reasoning are first-class, not bolted on. Agents that work with screenshots, video, design files are stronger here.
- Google Workspace. Agents that span Gmail, Calendar, Drive, Sheets are most natural here.
- Cost at scale. Gemini Flash is one of the cheapest capable models — agents that loop many times benefit.
- Vertex platform. Enterprise-grade infrastructure with strong eval tooling.
Where Gemini agents trail
- Tool ecosystem. Smaller than MCP (Anthropic) or the OpenAI plugin ecosystem.
- Developer-IDE story. Antigravity is solid but newer than Claude Code; less polish, smaller community.
- iOS workflow. Weaker than Claude's for native iOS development.
- Vendor stability concerns. Google's product-deprecation reputation continues to weigh on developer confidence.
Best use cases for Gemini agents
- Workspace-spanning workflows. "Read this thread of emails, summarize the asks, schedule follow-up meetings."
- Long-document reasoning. Legal review, large codebase navigation, scientific paper synthesis.
- Multimodal tasks. Video analysis, design QA, accessibility audits of screenshots.
- Enterprise GCP-native agents. Bound into your existing GCP stack.
- High-volume, cost-sensitive agents. Using Gemini Flash for cheap inference.
Getting started with Gemini agents
- Try Mariner if you have Gemini Advanced — assign a small web task.
- Try Antigravity on a non-critical project. See our Antigravity post.
- For programmatic agents: get a Google AI Studio API key, set up the Gemini SDK, build a 2-tool minimal agent.
- For production agents: use Vertex AI Agent Builder. Manage agent definitions in the console, deploy via API.
- Compare vs Claude / OpenAI for your specific task. Don't assume one platform is right for everything.
See also: Agentic AI Overview, Google Antigravity, Google Gemini, GCP Deep Dive.
- Google — Google AI for Developers
- Google — Vertex AI Agent Builder