Google Gemma 2 Model

Gemma 2: Built for Developers and Researchers

Bhavik Jikadara


Gemma 2 offers best-in-class performance, runs at incredible speed across a wide range of hardware, and integrates easily with other AI tools.

Google announced the global release of Gemma 2, which is now accessible to researchers and developers everywhere. Gemma 2 is available in two sizes: 9 billion (9B) and 27 billion (27B) parameters. This new generation outperforms the first Gemma generation, is more efficient at inference, and incorporates significant safety enhancements. Remarkably, the 27B model competes with models more than twice its size, offering performance levels that, as recently as December, were attainable only with proprietary models.

A New Standard in Open Models for Efficiency and Performance

Gemma 2 is built on a newly redesigned architecture, emphasizing both superior performance and inference efficiency. Here’s what sets it apart:

  • Exceptional Performance: The 27B Gemma 2 model offers unparalleled performance in its size class, rivaling models over twice its size. Similarly, the 9B model outperforms others in its category, including Llama 3 8B. Detailed performance metrics are available in the technical report.
  • Superior Efficiency and Cost Savings: Designed for efficient inference at full precision, the 27B Gemma 2 model operates on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU. This efficiency translates to lower deployment costs, making AI applications more accessible and affordable.
  • Rapid Inference Across Hardware: Optimized for speed, Gemma 2 runs incredibly fast on various hardware platforms, from high-end desktops and gaming laptops to cloud setups. You can experience Gemma 2’s full precision in Google AI Studio, use the quantized version with Gemma.cpp on your CPU, or run it on your home computer with an NVIDIA RTX or GeForce RTX GPU via Hugging Face Transformers (see the sketch below).
Source: the official Google blog post announcing Gemma 2
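
To make the Hugging Face Transformers path concrete, here is a minimal sketch of running the instruction-tuned 9B checkpoint on a single consumer RTX GPU. The 4-bit quantization settings and the prompt are illustrative assumptions (the quantization also requires the bitsandbytes package installed), not an official recipe.

```python
# Minimal sketch: run Gemma 2 9B (instruction-tuned) on a consumer RTX GPU.
# Assumes you have accepted the Gemma license on Hugging Face and logged in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-2-9b-it"  # instruction-tuned 9B variant

# Quantize to 4-bit so the 9B weights fit in typical desktop GPU memory.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "Explain what makes an open-weights model useful for researchers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Dropping the quantization_config and loading in bfloat16 gives full-precision inference on larger hardware such as an A100 or H100.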

Built for Developers and Researchers

Gemma 2 is designed not only for power but also for seamless integration into your workflows:

  • Open and Accessible: Like its predecessors, Gemma 2 is available under a commercially friendly license, allowing developers and researchers to share and monetize their innovations.
  • Broad Framework Compatibility: Gemma 2 works effortlessly with major AI frameworks such as Hugging Face Transformers, JAX, PyTorch, and TensorFlow (via native Keras 3.0), as well as vLLM, Gemma.cpp, Llama.cpp, and Ollama. Additionally, Gemma is optimized for NVIDIA TensorRT-LLM to run on NVIDIA-accelerated infrastructure or as an NVIDIA NIM inference microservice, with optimization for NVIDIA’s NeMo coming soon. You can fine-tune models today using Keras and Hugging Face (a minimal LoRA sketch follows this list), with more parameter-efficient fine-tuning options on the way.
  • Effortless Deployment: Beginning next month, Google Cloud customers can deploy and manage Gemma 2 easily on Vertex AI.
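
As a concrete example of the fine-tuning path mentioned above, here is a minimal parameter-efficient LoRA sketch using Hugging Face Transformers and PEFT. The dataset, adapter targets, and hyperparameters are illustrative assumptions, not an official Gemma recipe.

```python
# Minimal LoRA fine-tuning sketch for Gemma 2 9B with Hugging Face PEFT.
# Dataset and hyperparameters are placeholders; swap in your own task data.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_id = "google/gemma-2-9b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach small low-rank adapters instead of updating all 9B parameters.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tiny illustrative dataset of short text samples.
dataset = load_dataset("Abirate/english_quotes", split="train[:200]")

def tokenize(batch):
    return tokenizer(batch["quote"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gemma2-lora", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4, bf16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because only the adapter weights are trained, this fits on a single GPU far more easily than a full fine-tune of the 9B model.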

The new Gemma Cookbook offers practical examples and recipes for building applications and fine-tuning Gemma 2 models for specific tasks. Learn to use Gemma efficiently with your preferred tools, including common tasks like retrieval-augmented generation.
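
As a toy illustration of the retrieval-augmented generation pattern the Cookbook covers, the sketch below ranks a handful of local notes by naive keyword overlap and stuffs the best match into the prompt. The keyword retriever is an assumption made for brevity; a real application would use an embedding model and a vector store.

```python
# Toy RAG sketch: pick the most relevant note, then ask Gemma 2 with it as context.
from transformers import pipeline

docs = [
    "Gemma 2 ships in 9B and 27B parameter sizes.",
    "The 27B model runs on a single TPU host or a single A100/H100 GPU.",
    "Gemma 2 weights are available on Kaggle and Hugging Face.",
]

def retrieve(query, documents):
    """Return the document sharing the most words with the query (toy retriever)."""
    q = set(query.lower().split())
    return max(documents, key=lambda d: len(q & set(d.lower().split())))

query = "What hardware does the 27B model need?"
context = retrieve(query, docs)

generator = pipeline("text-generation", model="google/gemma-2-9b-it", device_map="auto")
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
print(generator(prompt, max_new_tokens=100)[0]["generated_text"])
```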

Projects built with Gemma

Our first Gemma launch led to more than 10 million downloads and countless inspiring projects. Navarasa, for instance, used Gemma to create a model rooted in India’s linguistic diversity.

Gemma 2 empowers developers to take on more ambitious AI projects with higher performance and greater potential. We will keep innovating with new architectures and specialized variants, including an upcoming 2.6B parameter model that balances accessibility and power. For more details, check out the technical report.

How to Use and Install Gemma 2

Gemma 2 is now available in Google AI Studio, so you can test the full 27B model’s capabilities without any local hardware requirements.

You can also download Gemma 2’s model weights from Kaggle and Hugging Face Models, with Vertex AI Model Garden coming soon.
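
If you prefer to fetch the weights programmatically, here is a minimal sketch using the huggingface_hub client. It assumes you have accepted the Gemma license on the model page and authenticated with `huggingface-cli login`.

```python
# Minimal sketch: download the Gemma 2 27B base weights from Hugging Face Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="google/gemma-2-27b",   # use "google/gemma-2-9b" for the smaller model
    local_dir="./gemma-2-27b",
)
print(f"Weights downloaded to {local_dir}")
```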

To enable access for research and development, Gemma 2 is also available free of charge through Kaggle or via a free tier for Colab notebooks.

Note: First-time Google Cloud customers may be eligible for $300 in credits. Academic researchers can apply to the Gemma 2 Academic Research Program to receive Google Cloud credits that accelerate their research with Gemma 2. Applications are open now through August 9.
