Why XR Has Always Been Hard — Until Now
Building extended reality (XR) apps has historically meant choosing between two bad options: wrestle with Unity or Unreal for weeks to get something barely running, or stitch together fragmented perception pipelines, sensor SDKs, and WebXR primitives by hand. Neither path rewards the kind of rapid, exploratory prototyping that moves ideas forward.
That barrier is now cracking open. Vibe Coding XR — a rapid prototyping workflow published by Google Research in March 2026 — combines the open-source XR Blocks framework with Gemini’s reasoning capabilities inside Gemini Canvas to turn plain-English prompts into fully interactive, physics-aware WebXR applications.
What Is XR Blocks?
XR Blocks is an open-source WebXR SDK built on top of three.js, TensorFlow, and Gemini. Its core mission is “minimum code from idea to reality.” Instead of writing raw WebXR scene management, you work with high-level, composable modules that handle the hard parts — depth, physics, gestures, spatial UI — so you can focus on what your XR experience actually does.
Presented at ACM UIST 2025, XR Blocks was designed specifically to close the gap between AI research tooling (JAX, PyTorch, TensorFlow with mature benchmarks) and XR development, which has remained fragmented and high-friction by comparison.
Core Modules
- **USER** — Hand tracking & gesture recognition: pinch, grasp, and custom gesture models wired to your scene objects automatically.
- **WORLD** — Environmental perception: depth-aware physics, geometry occlusion, lighting estimation. Your virtual objects behave like they belong in the real space.
- **INTERFACE** — Spatial UI: menus, labels, and HUD elements that anchor correctly in 3D space on both simulated desktop and real headsets.
- **AI** — Gemini integration: embed Gemini Live, LiteRT on-device models, and TensorFlow Lite inference directly into your XR scene with minimal wiring.
- **AGENTS** — Agentic behaviors: context-aware assistants and proactive suggestion engines that respond to user intent in spatial context.
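As an illustration of the kind of low-level logic the USER module abstracts away, here is a minimal, self-contained sketch of pinch detection from hand-joint positions. The joint names follow the WebXR Hand Input convention; the distance threshold and function shape are assumptions for illustration, not the XR Blocks API.

```javascript
// Euclidean distance between two [x, y, z] joint positions (meters).
function dist(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);
}

// A pinch is typically detected when thumb tip and index tip nearly touch.
// The 2 cm threshold here is an illustrative choice, not an XR Blocks value.
function isPinching(hand, thresholdMeters = 0.02) {
  return dist(hand['thumb-tip'], hand['index-finger-tip']) < thresholdMeters;
}

// Fingertips 1 cm apart -> pinch detected.
const hand = {
  'thumb-tip': [0.10, 1.20, -0.30],
  'index-finger-tip': [0.11, 1.20, -0.30],
};
console.log(isPinching(hand)); // true
```

In XR Blocks you would never write this yourself; the module exposes the recognized gesture as an event, which is what makes the generated code so short.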
The Vibe Coding XR Workflow
The workflow pairs the XR Blocks framework with a custom Gemini Gem (called XR Blocks Gem) loaded into Gemini Canvas. You describe what you want in natural language; Gemini translates that into structured XR Blocks code; and you get a deployable WebXR app that runs in both a desktop Chrome simulator and on Android XR headsets.
1. Go to gemini.google.com, load the XR Blocks Gem, and select “Pro Mode” for the best one-shot success rate.
2. Describe the spatial experience, interactions, objects, physics, and any AI behaviors you want. Be specific but natural — this is vibe coding, not boilerplate specification.
3. Gemini maps your intent to XR Blocks modules — world perception, gesture bindings, spatial UI, and physics. The output is clean, readable JavaScript using the XR Blocks API.
4. Test your XR experience immediately in Chrome’s WebXR emulator. No headset required for the initial iteration loop.
5. The same code ships unchanged to Android XR headsets (Galaxy XR). There is no separate build pipeline; the abstraction layer handles platform differences.
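Shipping one bundle to both desktop emulator and headset works because WebXR lets the same code feature-detect its environment at runtime. The sketch below shows that standard pattern with the real `isSessionSupported` Web API; the `pickSessionMode` helper and the mock object are illustrative, not part of XR Blocks.

```javascript
// Pick the richest session mode the current browser/device supports.
// 'immersive-ar' = passthrough headset, 'immersive-vr' = fully virtual,
// 'inline' = non-immersive fallback for a plain desktop tab.
async function pickSessionMode(xr) {
  if (xr && await xr.isSessionSupported('immersive-ar')) return 'immersive-ar';
  if (xr && await xr.isSessionSupported('immersive-vr')) return 'immersive-vr';
  return 'inline';
}

// In a browser you would pass navigator.xr; a mock is used here so the
// sketch is self-contained.
const mockXR = { isSessionSupported: async (mode) => mode === 'immersive-ar' };
pickSessionMode(mockXR).then((mode) => console.log(mode)); // "immersive-ar"
```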
Real Prompt Examples (What Gets Built)
These four examples were all generated by Gemini via the Vibe Coding XR workflow — no hand-written XR code, no game engine setup:
Math Tutor in XR
Euler’s theorem visualized in 3D. Pinch to highlight vertices, edges, and faces across multiple geometry examples.
Physics Lab
Grab and drop labeled weights onto a balance scale. Real physics, real haptic feedback.
XR Volleyball
Textured volleyballs launched from a ring, colliding with both your hands and room geometry.
XR Dino Game
The Chrome dinosaur game rebuilt in mixed reality, voxelized in your space. Went from concept to running app in minutes.
Which Gemini Model Should You Use?
Not all Gemini models are equal for XR Blocks code generation. Google evaluated multiple models against the VCXR-60 benchmark dataset in March 2026. Here’s what the numbers say:
| Model | Mode | Success Rate | Avg. Gen Time | Best For |
|---|---|---|---|---|
| Gemini 2.5 Pro | Pro Mode | > 95% | ~60–90s | Highest accuracy, complex prototypes |
| Gemini 2.5 Flash | Low Thinking | 87.4% | ~17s | Speed priority, rapid iteration |
| Gemini 2.5 Flash | Pro Mode | ~91% | ~35s | Good balance of speed + quality |
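One way to read this table: dividing generation time by success rate gives the expected wall-clock cost per successful generation, assuming failed attempts are simply retried from scratch. A quick sketch (the ~75 s figure for Pro is the midpoint of the reported 60–90 s range; the helper name is mine, not from the benchmark):

```javascript
// Expected time per successful generation under independent retries:
// a geometric distribution with success probability p has mean 1/p attempts.
function expectedTimePerSuccess(genTimeSeconds, successRate) {
  return genTimeSeconds / successRate;
}

console.log(expectedTimePerSuccess(17, 0.874).toFixed(1)); // "19.5" — Flash, Low Thinking
console.log(expectedTimePerSuccess(35, 0.91).toFixed(1));  // "38.5" — Flash, Pro Mode
console.log(expectedTimePerSuccess(75, 0.95).toFixed(1));  // "78.9" — 2.5 Pro, Pro Mode
```

Flash in Low Thinking mode stays roughly 4× faster per *successful* result even after accounting for its lower hit rate, which is why it suits rapid iteration while Pro suits one-shot complex prototypes.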
Understanding the XR Blocks Architecture (For Developers)
XR Blocks uses a Reality Model — a set of high-level composable abstractions that sit between your prompt and the raw WebXR/three.js engine layer. Unlike a World Model trained end-to-end, the Reality Model gives you replaceable, auditable modules. This is what makes Gemini-generated code predictable and debuggable rather than opaque.
The central concept is the Script — the narrative and logical heart of any XR Blocks app. A Script wires together input events, AI calls, world state, and UI updates into a coherent experience loop. When Gemini generates XR Blocks code, it’s really generating a Script that calls the right modules in the right order.
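To make the Script idea concrete, here is a deliberately tiny, self-contained sketch of that wiring pattern: an event comes in, world state updates, an AI call produces a description, and the UI reflects it. Every name below (`Script`, `ai.describe`, the state objects) is illustrative; the real XR Blocks Script API may differ.

```javascript
// Toy event hub standing in for the Script's role as the experience loop.
class Script {
  constructor() { this.handlers = {}; }
  on(event, fn) { (this.handlers[event] ??= []).push(fn); }
  emit(event, payload) { (this.handlers[event] ?? []).forEach((fn) => fn(payload)); }
}

const script = new Script();
const worldState = { highlighted: null };          // world state
const ui = { label: '' };                          // spatial UI stand-in
const ai = {                                       // stand-in for a Gemini call
  describe: (obj) => `A ${obj.shape} at ${obj.position}`,
};

// Wire the loop: input event -> world state -> AI call -> UI update.
script.on('pinch', (object) => {
  worldState.highlighted = object;
  ui.label = ai.describe(object);
});

script.emit('pinch', { shape: 'cube', position: '(0, 1, -2)' });
console.log(ui.label); // "A cube at (0, 1, -2)"
```

Because each stage is an explicit, replaceable call rather than a learned end-to-end mapping, the generated Script can be read, audited, and patched line by line.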
The architectural philosophy draws explicitly from Python’s Zen: readability counts. Every XR Blocks API is designed to be understood at a glance, which is exactly what makes LLM code generation for it so reliable.
⚡ Quick Start: Your First XR Blocks App
Three paths to get hands-on immediately:
Option A — Vibe Coding (No code knowledge needed)
```
// 1. Go to gemini.google.com
// 2. Load the XR Blocks Gem
// 3. Type your prompt and hit enter
"Create a solar system in XR where I can
grab planets with my hands and see their
orbital data as floating labels."
```

Option B — Fork the GitHub repo

```shell
git clone https://github.com/xrblocks/xrblocks
cd xrblocks
# Browse /templates and /samples
# Each folder is a standalone XR app
npm install && npm run dev
```

Option C — Start from an XR Blocks template

```javascript
// Minimal XR Blocks scene
import { XRScene, World, User, AI } from 'xrblocks';

const scene = new XRScene();
const world = scene.addModule(new World({ physics: true }));
const user = scene.addModule(new User({ hands: true }));
const ai = scene.addModule(new AI({ model: 'gemini-2.5-flash' }));

user.on('pinch', (object) => ai.describe(object));
```

Where This Is Heading
The XR Blocks team has a clear roadmap: the current xrblocks.js web SDK is the first step. Future versions are planned to extend to native platforms via LLM-powered compilers — meaning the same prompt-to-XR pipeline that works in the browser today will eventually compile down to native Android XR and other hardware targets.
The larger vision is closing the virtuous cycle that exists in AI research but not yet in XR: a thriving ecosystem of reproducible demos, shared benchmarks, and community-iterated components. Every demo built with XR Blocks is meant to be reusable by others — turning individual prototypes into building blocks for the next one.
Google Research is explicitly inviting the HCI, AI, and XR communities to contribute to the XR Blocks ecosystem. The benchmark dataset (VCXR-60) and the Gemini Gem configuration are both available alongside the framework.