
Our most capable models released to date

At the edge, our E2B and E4B models redefine on-device utility, prioritizing multimodal capabilities, low-latency processing and seamless ecosystem integration over raw parameter count.

It's powerful, it's accessible, it's open

To power the next generation of research infrastructure and products, we've sized Gemma 4 models specifically to run and fine-tune on real-world hardware – from the billions of Android devices worldwide, to consumer GPUs, all the way to developer workstations and datacenter accelerators.

You can fine-tune Gemma 4 to achieve the best performance on your specific tasks. We have seen incredible results from this approach: for example, INSAIT created the first Bulgarian language model, BgGPT, and we worked with Yale University on Cell2Sentence-Scale to explore new ways to treat cancer, among many others.
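As an illustration of what that fine-tuning can look like, here is a minimal LoRA sketch using Hugging Face transformers and peft. The checkpoint name "google/gemma-4-31b" is a placeholder assumption, not a confirmed identifier; adapter-based tuning like this keeps the base weights frozen and trains only small added matrices.

```python
# Minimal LoRA fine-tuning sketch, assuming the checkpoints are published on
# Hugging Face; "google/gemma-4-31b" is a hypothetical model id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-4-31b"  # placeholder identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Train small LoRA adapters instead of all weights to keep memory needs modest.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# From here, train with transformers.Trainer or any standard training loop.
```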

Here's what makes Gemma 4 our most capable open model family yet:

  • Advanced reasoning: Capable of multi-step planning and deep logical reasoning, Gemma 4 shows significant improvements on benchmarks that require mathematical reasoning and instruction following.
  • Agentic workflows: Native support for function calling, structured JSON output, and system instructions lets you build autonomous agents that interact with different tools and APIs and execute workflows reliably (see the sketch after this list).
  • Coding: Gemma 4 generates high-quality code entirely offline, turning your workstation into a capable local AI coding assistant.
  • Vision and audio: All models natively process images and video, support dynamic resolutions, and excel at visual tasks such as OCR and chart understanding. Additionally, the E2B and E4B models accept native audio input for speech recognition and understanding.
  • Long context: Process long-form content seamlessly. The edge models feature a 128K context window, while the larger models offer up to 256K, letting you work through large codebases or long documents in a single pass.
  • 140+ languages: With support for more than 140 languages, Gemma 4 helps developers build engaging, high-performance applications for a global audience.
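To make the agentic-workflow point concrete, here is a minimal sketch of requesting structured JSON output for a tool call from a locally served model. It assumes an OpenAI-compatible chat endpoint at localhost:8000 and the model name "gemma-4-26b", both of which are assumptions rather than a documented Gemma 4 API.

```python
# Structured-output agent step, assuming a local OpenAI-compatible server.
# The endpoint URL and model name below are placeholder assumptions.
import json
import requests

SYSTEM = 'Reply only with JSON of the form {"tool": <name>, "arguments": <object>}.'

def plan_tool_call(user_message: str) -> dict:
    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "gemma-4-26b",  # placeholder model name
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": user_message},
            ],
            "temperature": 0,
        },
        timeout=60,
    )
    content = resp.json()["choices"][0]["message"]["content"]
    return json.loads(content)  # raises if the model did not return valid JSON

print(plan_tool_call("What's the weather in Sofia tomorrow?"))
```

An agent loop would then dispatch the returned tool name and arguments to real functions and feed the results back as the next message.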

Different models for different hardware

We're releasing Gemma 4 open-weight models in sizes designed for specific hardware and use cases, so you get frontier-class reasoning wherever you need it:

26B and 31B models: Frontier intelligence, offline on your own machine

Designed to give researchers and developers state-of-the-art performance on affordable hardware, the unquantized bfloat16 checkpoints fit comfortably on a single 80GB NVIDIA H100 GPU. For local setups, quantized versions run natively on consumer GPUs to power your IDEs, coding assistants and agent workflows. Our 26B Mixture-of-Experts (MoE) model is built for latency, activating only 3.8 billion of its total parameters during inference to deliver the fastest tokens per second, while our 31B dense model maximizes raw quality and provides a powerful foundation for fine-tuning.
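For orientation, here is a minimal local-inference sketch with Hugging Face transformers. The checkpoint name "google/gemma-4-26b" and its availability in bfloat16 on the Hub are assumptions, not confirmed details.

```python
# Local single-GPU inference sketch; "google/gemma-4-26b" is a placeholder id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-4-26b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # unquantized bfloat16 weights, as described above
    device_map="auto",           # place layers on the available GPU(s)
)

prompt = "Explain mixture-of-experts routing in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```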

E2B and E4B models: A new level of intelligence for mobile and IoT devices

Designed from the ground up for compute and memory efficiency, these models activate 2 billion and 4 billion parameters respectively during inference to preserve RAM and battery life. Built in close collaboration with our Google Pixel team and mobile hardware leaders such as Qualcomm Technologies and MediaTek, these multimodal models run completely offline with near-zero latency on edge devices such as phones, the Raspberry Pi, and the NVIDIA Jetson Orin Nano. Android developers can try agentic flows in the AICore developer preview today, with forward compatibility with Gemini Nano 4.
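As a rough sketch of what fully offline edge inference can look like, the snippet below uses llama-cpp-python with a quantized GGUF file. The filename "gemma-4-e2b-q4_k_m.gguf" is hypothetical and depends on how the weights are eventually converted and quantized.

```python
# Offline edge-inference sketch with llama-cpp-python; the GGUF filename is
# a hypothetical placeholder for a quantized Gemma 4 E2B conversion.
from llama_cpp import Llama

llm = Llama(model_path="gemma-4-e2b-q4_k_m.gguf", n_ctx=8192)  # placeholder file
reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize today's meeting notes in one sentence."}],
    max_tokens=64,
)
print(reply["choices"][0]["message"]["content"])
```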

Open source license

You gave us feedback, and we listened. Building the future of AI requires a collaborative approach, and we believe in empowering the developer ecosystem without restrictive barriers. That's why Gemma 4 is released under the commercially friendly Apache 2.0 license.

This open-source license is the foundation for full developer flexibility and digital sovereignty: it gives you complete control over your data, infrastructure, and models, and lets you build freely and deploy securely in any environment, whether on-premises or in the cloud.
