The most advanced lightweight multimodal AI model designed for mobile and edge devices
Gemma 3N is Google's latest open multimodal AI model, specifically optimized for on-device applications. It accepts image, audio, video, and text input and generates text output, designed for real-time processing on smartphones, tablets, and edge devices.
🚀 Latest Release: Part of Gemma family with 1.6+ billion downloads worldwide
Revolutionary technical advantages that make it the most advanced mobile AI model
Accepts image, audio, video, and text input natively, with text output
E2B (5B params → ~2GB) and E4B (8B params → ~3GB) memory footprints
Supports Android, TPU, edge devices (Jetson etc.)
Custom model slicing for specific hardware
Native support for image, audio, video, and text input, with text output. Process diverse content types seamlessly.
Run 5B/8B parameter models with just 2GB/3GB memory. Optimized for mobile and edge devices.
Supports Android, TPU, and edge devices (Jetson etc.) with optimized performance for mobile deployment.
Mix-n-Match architecture allows for flexible model configurations tailored to your specific needs.
E4B model scores over 1300 on LMArena, the first model under 10B parameters to do so. Superior performance in real-world tasks.
KV cache sharing accelerates streaming input, delivering roughly 2x faster prefill for quicker first responses. Optimized for real-time applications.
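The caching idea can be sketched in a few lines (a toy illustration, not Gemma's actual implementation): attention state computed for earlier chunks of a stream is kept around, so each new chunk only adds its own entries instead of recomputing the whole sequence.

```python
# Toy sketch of KV cache reuse for streaming input (hypothetical, not
# Gemma's real code): keys/values from earlier chunks are retained, so a
# new chunk appends its own entries rather than rebuilding prior state.

class KVCache:
    def __init__(self):
        self.keys, self.values = [], []

    def extend(self, new_keys, new_values):
        # Append the new chunk's entries; earlier positions stay cached.
        self.keys.extend(new_keys)
        self.values.extend(new_values)

    def __len__(self):
        return len(self.keys)

cache = KVCache()
for chunk in [[0.1, 0.2], [0.3], [0.4, 0.5]]:  # streamed frames (toy values)
    cache.extend(chunk, chunk)  # in a real model, keys and values differ
# five positions are now cached; the next chunk reuses them directly
```

The win for long audio/video sequences is that time to first token depends only on the newest chunk, not on everything received so far.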
Industry-leading performance metrics
Revolutionary innovations that make Gemma 3N the most advanced mobile AI model
Matryoshka Transformer - like "Russian nesting dolls", one model contains multiple sub-models with nested inference capabilities.
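A minimal sketch of the nesting idea in plain Python (hypothetical; none of this is the actual MatFormer implementation): the smaller sub-model is a prefix slice of the full model's weight matrices, so one checkpoint can serve multiple model sizes.

```python
# Hypothetical illustration of Matryoshka-style nesting: a sub-model
# reuses a prefix slice of the full model's weights, so one checkpoint
# contains both the large and the small configuration.

def linear(weights, x):
    """Plain matrix-vector product: weights is rows x cols, x has len cols."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

# "Full" layer: 4 hidden units over a 4-dim input (identity, for clarity).
full_weights = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

def sub_layer(weights, dim):
    """Slice out the nested sub-model: the first `dim` rows and columns."""
    return [row[:dim] for row in weights[:dim]]

x = [1.0, 2.0, 3.0, 4.0]
full_out = linear(full_weights, x)                      # full model path
small_out = linear(sub_layer(full_weights, 2), x[:2])   # nested sub-model path
```

The same slicing view is what makes Mix-n-Match configurations possible: intermediate sizes are carved out of the one set of trained weights rather than trained separately.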
Per-Layer Embeddings - independent loading for each layer embedding, dramatically reducing memory pressure.
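The loading strategy can be illustrated with a toy sketch (hypothetical, not the real implementation): embedding tables are fetched per layer on demand and the previous table is evicted, so only one table is resident at a time instead of all of them.

```python
# Hypothetical sketch of Per-Layer Embeddings: each layer's embedding
# table is loaded on demand and the previous one evicted, keeping peak
# residency at a single table rather than all layers at once.

load_log = []  # records which layer tables were actually fetched

def load_table(layer):
    """Stand-in for reading one layer's embedding table from storage."""
    load_log.append(layer)
    return {"layer": layer, "rows": 262144}  # toy table

class PerLayerEmbeddings:
    def __init__(self, loader):
        self._loader = loader
        self._cache = {}  # holds at most one layer's table

    def get(self, layer):
        if layer not in self._cache:
            self._cache.clear()               # evict the previous layer first
            self._cache[layer] = self._loader(layer)
        return self._cache[layer]

ple = PerLayerEmbeddings(load_table)
for layer in [0, 0, 1, 2]:                    # a forward pass touching layers in order
    ple.get(layer)
# layer 0 was served from cache on the repeat; only one table stayed resident
```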
Context cache sharing designed for audio/video long sequence input scenarios, significantly accelerating first response time.
Built-in speech understanding capabilities based on USM (Universal Speech Model) with fine-grained context awareness.
Revolutionary MobileNet-V5-300M encoder designed for mobile-first vision understanding.
Choose the Gemma 3N model version that fits your project needs
Local Deploy
ollama pull gemma3n
Gemma 3N E4B scores 1303 on LMArena, making it the first model under 10 billion parameters to break 1300, competitive with much larger proprietary models such as Gemini 1.5 Pro and GPT-4.1-nano.
Built in collaboration with industry leaders for comprehensive platform support
Gemma 3N launched with support across 13+ platforms, making it the most comprehensive model release to date.
Join the growing community and explore additional resources
Access pre-trained models, code examples, and community contributions.
Explore Models →
Join discussions, share experiences, and get help from the LocalLLaMA community.
Join Discussion →
Everything you need to know about Gemma 3N
Gemma 3N is Google's latest lightweight multimodal AI model designed for mobile and edge devices. It accepts text, image, audio, and video input and produces text output, with exceptional efficiency and performance.
You can use Gemma 3N through multiple platforms, including Ollama, Hugging Face, or Google AI Studio. The easiest method is Ollama: install Ollama, then run ollama pull gemma3n to download the model.
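A typical local setup might look like the following; the model tag and variant names (e2b, e4b) are assumptions based on common Ollama conventions — check the Ollama model library for the exact identifiers.

```shell
# Assumed tags on the Ollama registry; verify with the Ollama library page.
ollama pull gemma3n:e2b    # smaller variant, ~2GB memory footprint
ollama run gemma3n:e2b "Summarize this model in one sentence."

# For the larger variant (~3GB footprint):
ollama pull gemma3n:e4b
```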
Yes, Gemma 3N is completely free to use. You can download the model for free, run it locally, or use Google AI Studio's free tier for online access.
Gemma 3N supports Windows, macOS, and Linux. The E2B model requires only 2GB RAM, while E4B needs 3GB RAM. It supports both CPU and GPU inference, with mobile-first optimization for Android and edge devices.