Google is building a feature that would let Gemini operate third-party Android apps on your behalf, handling tasks like hailing an Uber or ordering food by interacting directly with app interfaces. The capability, codenamed "bonobo," surfaced in Google app version 17.4 beta this week.
How it works
The feature uses what Google calls "screen automation" to let Gemini see and tap through apps just like a human would. Tell it to book a ride home, and the assistant would open your ride-hailing app, enter the destination, and confirm the request.
Strings found in the beta describe it plainly: Gemini can help with tasks "like placing orders or booking rides, using screen automation on certain apps on your device." The emphasis on "certain apps" is doing a lot of work there. Google hasn't said which apps will support the feature, and given how often app interfaces change, the initial list is likely to be short.
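To make the idea concrete, here's a minimal sketch of how this kind of automation is commonly built on Android's AccessibilityService API, which lets one app read another app's UI tree and perform taps and text entry on its behalf. Google hasn't said that Gemini's screen automation uses this API; the class name, label strings, and destination below are assumptions for illustration only.

```kotlin
import android.accessibilityservice.AccessibilityService
import android.os.Bundle
import android.view.accessibility.AccessibilityEvent
import android.view.accessibility.AccessibilityNodeInfo

// Illustrative only: a bare-bones service that fills in a destination field
// and taps a confirm button in whatever app is in the foreground.
class RideBookingAutomation : AccessibilityService() {

    override fun onAccessibilityEvent(event: AccessibilityEvent) {
        // Act only when a new app window comes to the foreground.
        if (event.eventType != AccessibilityEvent.TYPE_WINDOW_STATE_CHANGED) return
        val root = rootInActiveWindow ?: return

        // Find the destination field by its visible label and type into it.
        root.findAccessibilityNodeInfosByText("Where to?")
            .firstOrNull()
            ?.performAction(
                AccessibilityNodeInfo.ACTION_SET_TEXT,
                Bundle().apply {
                    putCharSequence(
                        AccessibilityNodeInfo.ACTION_ARGUMENT_SET_TEXT_CHARSEQUENCE,
                        "Home"
                    )
                }
            )

        // Tap the confirm button the same way a finger would.
        root.findAccessibilityNodeInfosByText("Confirm")
            .firstOrNull { it.isClickable }
            ?.performAction(AccessibilityNodeInfo.ACTION_CLICK)
    }

    override fun onInterrupt() {
        // A real agent would hand control back to the user here.
    }
}
```

Everything in a flow like this hinges on matching elements by their visible labels, which is exactly the kind of coupling that breaks when an app redesigns its screens.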
The underlying plumbing showed up in Android 16 QPR3 Beta 2 last month, where a new "Screen automation" permission appeared under Settings > Apps > Special app access. For now, the permission is visible only on Pixel 10 devices and offers three options: Always allow, Ask every time, or Don't allow.
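For a sense of what that toggle looks like from an app's side, here's a hedged sketch of how a Special app access setting is commonly queried via AppOpsManager. The op string is invented for illustration, since Google hasn't published the real identifier, and the mapping of the three options to op modes is an assumption.

```kotlin
import android.app.AppOpsManager
import android.content.Context
import android.os.Process

// Hypothetical op name -- the real identifier for the screen automation
// permission has not been published.
private const val OPSTR_SCREEN_AUTOMATION = "android:screen_automation"

fun screenAutomationMode(context: Context): String {
    val appOps = context.getSystemService(AppOpsManager::class.java)
    // unsafeCheckOpNoThrow reports how the per-app toggle is currently set.
    val mode = appOps.unsafeCheckOpNoThrow(
        OPSTR_SCREEN_AUTOMATION,
        Process.myUid(),
        context.packageName
    )
    // Mapping the three UI options to op modes is an assumption for illustration.
    return when (mode) {
        AppOpsManager.MODE_ALLOWED -> "Always allow"
        AppOpsManager.MODE_DEFAULT -> "Ask every time"
        else -> "Don't allow"
    }
}
```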
The fine print
Google is already hedging. The beta includes the disclaimer that "Gemini can make mistakes" and warns users they're "responsible for what it does on your behalf, so supervise it closely." You can interrupt the AI and take over manually at any point, which seems less like a feature and more like an acknowledgment that autonomous app control isn't ready for unsupervised use.
The privacy language is blunt. With activity history enabled, screenshots of Gemini's interactions are reviewed by human trainers to improve the service. Google advises against entering login credentials or payment information into Gemini chats, and recommends skipping screen automation entirely for emergencies or sensitive tasks. That's a notable caveat for a feature whose main selling point is handling errands without user involvement.
Where this came from
Google previewed this capability at I/O 2025 under Project Astra, its research prototype for what it calls a "universal AI assistant." The demo showed Astra scrolling through PDFs, searching YouTube, and even calling a bike shop to check on part availability. The sped-up demo footage and limited subsequent details suggested the technology wasn't close to production.
That appears to be changing. The "bonobo" code in the Google app and the QPR3 permission framework indicate Google is moving from research demo to something it can actually ship.
When Android 16 QPR3 reaches stable release (expected around March), the screen automation permission will be in place. Whether Gemini actually uses it by then is another question.