TL;DR
Google is rolling out a "Select from screen" tool for Gemini in Chrome that lets users highlight on-screen content to focus their AI prompts, while also releasing Gemini 3.5 Flash with computer-use capabilities to developers. This dual launch marks a significant step toward making AI assistants both more context-aware and more autonomous, directly competing with similar moves from OpenAI and Anthropic.
What Happened
Google announced on Wednesday, June 24, 2026, that Gemini in Chrome is gaining a new "Select from screen" tool that allows users to visually highlight any portion of their browser window to focus an AI prompt, while simultaneously releasing Gemini 3.5 Flash with native computer-use capabilities to developers. The two updates, detailed by 9to5Google, represent a coordinated push to make Google's AI assistant more visually interactive and operationally autonomous.
Key Facts
- The "Select from screen" tool lets Chrome users drag a selection box over any on-screen content — text, images, or interface elements — to narrow the context of a Gemini prompt.
- Gemini 3.5 Flash is Google's fastest model tier, and the new computer-use capability enables it to navigate interfaces, click buttons, and fill forms autonomously based on natural language instructions.
- The computer-use feature is launching first for developers via the Gemini API, with no public availability date announced for end users.
- This update follows OpenAI's GPT-4o with computer use (released March 2026) and Anthropic's Claude with computer control (December 2025) — placing Google in a direct competitive race for autonomous AI agents.
- The "Select from screen" tool works across all Chrome tabs and can be triggered from the Gemini sidebar or via a keyboard shortcut (Ctrl+Shift+E on Windows, Cmd+Shift+E on Mac).
- Google has not disclosed whether the screen selection data is processed locally or on cloud servers, raising immediate privacy questions for enterprise and consumer users.
- The company confirmed the feature will roll out to Chrome Stable over the next two weeks, with Chrome Enterprise and Education versions following in July 2026.
Breaking It Down
The "Select from screen" tool addresses a basic friction point in AI assistants: users cannot precisely communicate what they are looking at. Previously, a user might type "explain this chart" or "summarize this paragraph," but the AI had no reliable way to know which chart or paragraph they meant. By letting users draw a bounding box around the exact content of interest, Google removes most of that ambiguity. The tool works on any web page, not just Google properties, making it a universal interaction layer for Chrome.
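The bounding-box idea can be sketched in a few lines. This is a minimal illustration, not Google's implementation: the `Box` type, the `selection_to_box` and `build_prompt_context` functions, and the payload shape are all hypothetical names chosen for this example. It shows how a drag gesture in any direction might be normalized into a rectangle, clamped to the viewport, and attached to a prompt.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    width: int
    height: int


def _clamp(value, low, high):
    """Keep value inside the closed range [low, high]."""
    return max(low, min(high, value))


def selection_to_box(start, end, viewport_w, viewport_h):
    """Turn a drag gesture (two corner points) into a clamped bounding box."""
    # The user may drag in any direction, so sort each axis first.
    x0, x1 = sorted((start[0], end[0]))
    y0, y1 = sorted((start[1], end[1]))
    # Clamp to the visible viewport so the box never extends off-screen.
    x0, x1 = _clamp(x0, 0, viewport_w), _clamp(x1, 0, viewport_w)
    y0, y1 = _clamp(y0, 0, viewport_h), _clamp(y1, 0, viewport_h)
    return Box(left=x0, top=y0, width=x1 - x0, height=y1 - y0)


def build_prompt_context(question, box):
    """Bundle the question with the selected region (hypothetical payload shape)."""
    return {"question": question, "region": asdict(box)}
```

For instance, dragging from the bottom-right corner (300, 200) up to (100, 50) in a 1280x720 viewport yields a box with its top-left corner at (100, 50), 200 pixels wide and 150 pixels tall. The prompt "explain this chart" then travels with that region instead of the whole page.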
According to a Google engineer's public post on the Chromium blog, 80% of Gemini prompt failures in early internal testing involved the AI misidentifying which on-screen element the user was referencing. The "Select from screen" tool reduced those errors to below 5%.
The computer-use capability in Gemini 3.5 Flash is the more consequential update. It moves Gemini from a passive assistant that answers questions to an active agent that performs tasks. A developer could instruct it to "log into the admin panel, download the weekly report, and email it to the team" — and the model would execute each step by manipulating the browser interface. This mirrors the architecture of Anthropic's Claude Computer Use (December 2025) and OpenAI's Operator (March 2026), both of which use screenshot-based vision models to parse and interact with on-screen elements. Google's advantage lies in its Chrome ecosystem: Gemini 3.5 Flash can leverage Google's deep integration with Chrome's accessibility APIs, potentially making its computer-use actions faster and more reliable than competitors that rely solely on pixel-level vision.
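Computer-use agents of this kind are generally built around an observe, decide, act loop. The sketch below illustrates that loop under stated assumptions: `FakeBrowser`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names invented for this example, and nothing here calls the Gemini API or any real browser. The scripted policy stands in for the model call, which in a real system would receive a screenshot or accessibility tree and return the next action.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    target: str = ""   # element label, e.g. "Login button"
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakeBrowser:
    """Stand-in for a real browser; records actions instead of performing them."""
    log: list = field(default_factory=list)

    def observe(self):
        # A real agent would capture a screenshot or accessibility tree here.
        return {"step": len(self.log)}

    def apply(self, action):
        self.log.append((action.kind, action.target, action.text))


def scripted_policy(plan):
    """Return a policy that replays a fixed plan; a model call would go here."""
    def policy(observation):
        step = observation["step"]
        return plan[step] if step < len(plan) else Action("done")
    return policy


def run_agent(browser, policy, max_steps=10):
    """Observe, decide, act; stop on "done" or when the step budget runs out."""
    for _ in range(max_steps):
        action = policy(browser.observe())
        if action.kind == "done":
            return True
        browser.apply(action)
    return False
```

The `max_steps` budget is a deliberate design choice in this sketch: an agent that never reports completion is stopped rather than left to keep clicking, and `run_agent` returns `False` so the caller can tell the task did not finish.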
The timing is strategic. By releasing both tools simultaneously, Google positions Gemini as both a precision tool (select-from-screen for focused queries) and an automation engine (computer use for complex workflows). This dual approach targets two distinct user segments: knowledge workers who need faster, more accurate answers, and developers who want to build autonomous agents. The decision to restrict computer-use to developers initially suggests Google is cautious about unleashing autonomous AI on the open web, where a poorly designed agent could accidentally delete accounts, purchase items, or expose sensitive data.
What Comes Next
- Public rollout of "Select from screen" will complete by July 8, 2026, for Chrome Stable users, with enterprise and education channels following by mid-July. Expect Google to monitor crash reports and privacy complaints closely during this initial phase.
- Developer access to Gemini 3.5 Flash computer use will expand through the Gemini API in phased tiers, starting with trusted partners in late June, then general availability by August 2026. Pricing has not been announced, but analysts expect it to be per-action rather than per-token, given the computational cost of screenshot processing.
- Privacy and security audits are likely to intensify. Google will need to publish a white paper on data handling for both features, especially regarding whether screen selection data is uploaded to servers or processed locally. The European Data Protection Board has already signaled interest in reviewing the feature under GDPR.
- A consumer version of computer use could arrive in Chrome by late 2026, possibly branded as "Gemini Actions" or "Gemini Automate," similar to how Google Assistant Routines evolved into scriptable automations for smart home devices.
The Bigger Picture
This dual launch sits at the intersection of two major trends: visual AI interaction and autonomous agents. The "Select from screen" tool represents a maturation of how humans communicate with AI — moving from text-only prompts to mixed-modality interactions that include visual references, gestures, and spatial context. It's a small step toward the kind of seamless interaction that Apple Intelligence and Samsung Galaxy AI have promised, but Google is the first to implement it directly in a browser with universal page support.
The autonomous agent race is now fully engaged. With Google, OpenAI, and Anthropic all offering computer-use capabilities, the battleground is shifting from who has the smartest model to who can build the most reliable, safe, and useful agent. Google's advantage is distribution: Chrome has over 3.2 billion active users globally. If Google can make computer-use agents work reliably in Chrome, it could leapfrog competitors who require users to install separate apps or browser extensions. However, that same scale makes safety failures catastrophic — a single widely publicized incident of a Gemini agent making an unauthorized purchase or deleting a user's data could set the entire field back years.
Key Takeaways
- Screen Selection Launch: Google's "Select from screen" tool for Gemini in Chrome lets users visually highlight content to focus AI prompts, rolling out over the next two weeks to all Chrome Stable users.
- Computer Use for Developers: Gemini 3.5 Flash now offers native computer-use capabilities via the API, enabling autonomous interface navigation — but only for developers initially.
- Competitive Positioning: Google is directly challenging OpenAI's GPT-4o and Anthropic's Claude in the autonomous agent space, leveraging Chrome's 3.2 billion users as a distribution advantage.
- Privacy Questions Remain: Neither feature's data-handling policies have been fully disclosed, with regulators in Europe already signaling scrutiny over screen capture and cloud processing.



