Google's long-awaited AI vision is finally becoming a reality. Today, Google and its partners jointly announced that the Gemini-based "task automation" feature has entered beta testing. The feature marks the transformation of AI assistants from mere "information seekers" into "digital assistants" capable of performing cross-app tasks, simulating human operations to complete complex processes such as ordering food and hailing a taxi.


Visual Impact: Watching the Phone "Use Itself"

Unlike traditional API integration, Gemini's automation feature simulates real user operations within a virtual window:

  • Smart Taxi Hailing: When you give the instruction "Hail a taxi to the airport," Gemini will automatically open Uber, confirm the specific terminal (asking proactively if there are several), and fill in the destination.

  • Ordering Food: Given the instruction "Order me a coffee and a croissant," the AI will scroll through the screen on its own to find specific items on the Starbucks menu (such as a Flat White), handling even fiddly scrolling selections the way a human would.
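The behavior described above is essentially an observe-and-act loop: rather than calling an app's API, the agent looks at what is on screen, picks a UI action (tap, scroll, type), and repeats until it finds the target. Google has not published its implementation, so the following is only a minimal sketch with entirely hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class UiAction:
    kind: str    # "tap", "scroll", or "type"
    target: str  # UI element or region the action applies to

def plan_next_action(goal: str, screen_elements: list[str]) -> UiAction:
    """Toy policy: tap the first on-screen element mentioned in the goal,
    otherwise scroll to reveal more of the menu (as when hunting for a
    Flat White on a long Starbucks menu)."""
    for element in screen_elements:
        if element.lower() in goal.lower():
            return UiAction("tap", element)
    return UiAction("scroll", "menu")

def run_agent(goal: str, screens: list[list[str]]) -> list[UiAction]:
    """Step through successive screen snapshots until the goal item is tapped."""
    trace = []
    for elements in screens:
        action = plan_next_action(goal, elements)
        trace.append(action)
        if action.kind == "tap":
            break
    return trace

# The target item only appears after one scroll, mimicking a long menu.
trace = run_agent(
    "Order me a Flat White",
    screens=[["Latte", "Cappuccino"], ["Flat White", "Croissant"]],
)
print([(a.kind, a.target) for a in trace])
# → [('scroll', 'menu'), ('tap', 'Flat White')]
```

The real system presumably drives pixels and accessibility trees rather than string lists, but the loop structure (perceive screen, choose action, repeat) is the core idea of UI-level automation.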

Security Logic: Human Control at Key Points

To avoid the risks associated with autonomy, Google has implemented a strict human review mechanism in the automation process:

Explicit Operation: Users can watch Gemini's every step in real time and take control of, or terminate, the automation process at any moment.

Last Confirmation: Before submitting an order or a payment, the system stops at the payment screen and waits for the user to verify the details and manually tap "Confirm," ensuring that every transaction completes under the user's control.
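This "last confirmation" safeguard can be thought of as a hard gate in the automation flow: the agent prepares everything but cannot complete payment without an explicit human approval. A minimal sketch, with hypothetical names (this is not Google's actual API):

```python
from typing import Callable

def place_order(order: dict, confirm: Callable[[str], bool]) -> str:
    """Fill in the order, then stop at the payment step and hand control
    to the user; only their explicit approval submits the transaction."""
    summary = f"{order['item']} for ${order['price']:.2f} via {order['app']}"
    if not confirm(summary):   # human reviews details on the pay screen
        return "cancelled"     # user can terminate at any point
    return "submitted"         # only a manual "Confirm" completes payment

# Simulate a user who checks the summary and taps "Confirm".
status = place_order(
    {"item": "Flat White", "price": 4.50, "app": "Starbucks"},
    confirm=lambda summary: "Flat White" in summary,
)
print(status)  # → submitted
```

The design point is that the confirmation callback sits between the agent and any irreversible side effect, so autonomy never extends past the payment boundary.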

Currently, the feature is rolling out first to delivery and ride-hailing applications. For users in this beta and those who follow, the phone is no longer just a carrier for running apps but a "super agent" that understands natural-language intent and converts it into concrete actions.

Although the AI still occasionally looks "clumsy" when scrolling menus and identifying options, an automation model that works directly through UI interactions rather than requiring deep API adaptation greatly expands what AI assistants can reach. As the algorithms iterate, we are leaving the era of constantly switching between apps and entering a genuinely intelligent stage where small tasks are completed with a single sentence.