Google recently announced an upgrade to the Gemini assistant in Android Studio, adding support for multimodal input. Developers can now attach images to prompts for visual assistance during app development.

First showcased at I/O 2024, the upgraded Gemini can now "understand simple wireframes and convert them into usable Jetpack Compose code." The "Ask Gemini" field in the Canary version of Android Studio Narwhal includes a new "Attach image file" option (supporting JPEG or PNG formats). Google recommends using images with "strong color contrast" and providing "clear prompts" for optimal results.
Developers can upload screenshots and UI mockups, ranging from simple wireframes to high-fidelity designs, and specify the desired functionality. With a calculator design, for example, a developer could ask Gemini to "make the interaction and calculation work as expected."

Typical prompts for converting visual designs into functional UI code include:
1. "For this provided image, write Android Jetpack Compose code to produce a screen as close as possible to this image. Ensure to include imports, use Material3, and document the code."
2. "For this provided image, write Android Jetpack Compose code to produce a screen as close as possible to this image, be creative with the colors. Make the interaction and calculation work as expected. Ensure to include imports, use Material3, and document the code."
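To make the second prompt more concrete, here is a hand-written sketch of the kind of Material3 Compose screen it asks for. It is not Gemini output. The names `CalcState`, `press`, and `CalculatorScreen` are hypothetical, the layout is a plain four-by-four button grid with no decimal key, and the code assumes an Android project with the Compose Material3 dependencies already configured.

```kotlin
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.padding
import androidx.compose.material3.Button
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp

/** Hypothetical calculator state: the display text plus any pending operation. */
data class CalcState(
    val display: String = "0",
    val stored: Double? = null,     // left operand of the pending operation
    val op: Char? = null,           // pending operator, if any
    val startNew: Boolean = true,   // true when the next digit replaces the display
)

/** Pure reducer: returns the state that results from one key press. */
fun press(state: CalcState, key: Char): CalcState {
    return when (key) {
        in '0'..'9' ->
            if (state.startNew || state.display == "0") {
                state.copy(display = key.toString(), startNew = false)
            } else {
                state.copy(display = state.display + key)
            }
        '+', '-', '*', '/' ->
            if (state.startNew && state.op != null) {
                state.copy(op = key) // operator pressed twice: just swap it
            } else {
                val evaluated = evaluate(state)
                val value = evaluated.display.toDoubleOrNull()
                if (value == null) evaluated
                else evaluated.copy(stored = value, op = key, startNew = true)
            }
        '=' -> evaluate(state).copy(stored = null, op = null, startNew = true)
        'C' -> CalcState()
        else -> state
    }
}

/** Applies the pending operation, if there is one, to the stored value and the display. */
private fun evaluate(state: CalcState): CalcState {
    val left = state.stored ?: return state
    val op = state.op ?: return state
    val right = state.display.toDoubleOrNull() ?: return CalcState(display = "Error")
    val result = when (op) {
        '+' -> left + right
        '-' -> left - right
        '*' -> left * right
        else -> if (right == 0.0) return CalcState(display = "Error") else left / right
    }
    return state.copy(display = format(result))
}

/** Shows whole numbers without a trailing ".0". */
private fun format(value: Double): String =
    if (value % 1.0 == 0.0) value.toLong().toString() else value.toString()

/** Material3 screen: a display line above a grid of calculator keys. */
@Composable
fun CalculatorScreen() {
    var state by remember { mutableStateOf(CalcState()) }
    val rows = listOf("789/", "456*", "123-", "C0=+")
    MaterialTheme {
        Column(
            modifier = Modifier.padding(16.dp),
            verticalArrangement = Arrangement.spacedBy(8.dp),
        ) {
            Text(text = state.display, style = MaterialTheme.typography.displayMedium)
            rows.forEach { row ->
                Row(
                    modifier = Modifier.fillMaxWidth(),
                    horizontalArrangement = Arrangement.spacedBy(8.dp),
                ) {
                    row.forEach { key ->
                        Button(
                            onClick = { state = press(state, key) },
                            modifier = Modifier.weight(1f),
                        ) { Text(key.toString()) }
                    }
                }
            }
        }
    }
}
```

Keeping the arithmetic in a pure `press` function, separate from the composable, is one way to make the "interaction and calculation" part easy to check by hand after generation, since it can be exercised without rendering any UI.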

Google positions Gemini as a tool for providing an "initial design framework," with the generated code often requiring further editing and refinement. Common improvements include ensuring correct imports for drawables and icons. Google suggests viewing the generated code as an efficient starting point to accelerate the UI development workflow.
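As an illustration of the import cleanup mentioned above, the sketch below shows the imports a generated icon button typically needs before it compiles. The composable name `AddButton` is hypothetical, and the example assumes the project includes the Compose Material3 and Material icons artifacts.

```kotlin
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Add
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButton
import androidx.compose.runtime.Composable

/** Hypothetical button; without the two `icons` imports, `Icons.Filled.Add` does not resolve. */
@Composable
fun AddButton(onClick: () -> Unit) {
    IconButton(onClick = onClick) {
        Icon(imageVector = Icons.Filled.Add, contentDescription = "Add")
    }
}

// For a drawable shipped in the app's own resources, the equivalent fix is to
// import androidx.compose.ui.res.painterResource and the project's R class, then
// pass painterResource(id = ...) with a drawable that actually exists in res/.
```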
Furthermore, Gemini's visual analysis capabilities can be used to identify and resolve errors. Developers can "upload a screenshot of a problematic UI, and Gemini will analyze the image and suggest potential solutions." Developers can also attach relevant code snippets for more precise assistance.
Gemini in Android Studio also accepts architecture diagrams and can return explanations or documentation for them, similar to the Project Astra glasses demo shown at the I/O conference.
