Google Gemma 2 Model

Gemma 2: Built for Developers and Researchers

Bhavik Jikadara


Gemma 2 offers best-in-class performance, runs at incredible speed across a wide range of hardware, and integrates easily with other AI tools.

Google announced the global release of Gemma 2, which is now accessible to researchers and developers everywhere. Gemma 2 is available in two sizes: 9 billion (9B) and 27 billion (27B) parameters. This new generation outperforms the first Gemma generation, is more efficient at inference, and incorporates significant safety enhancements. Remarkably, the 27B model competes with models more than twice its size, offering performance levels that, as recently as December, were attainable only with proprietary models.

A New Standard in Open Models for Efficiency and Performance

Gemma 2 is built on a newly redesigned architecture, emphasizing both superior performance and inference efficiency. Here’s what sets it apart:

  • Exceptional Performance: The 27B Gemma 2 model offers unparalleled performance in its size class, rivaling models over twice its size. Similarly, the 9B model outperforms others in its category, including Llama 3 8B. Detailed performance metrics are available in the technical report.
  • Superior Efficiency and Cost Savings: Designed for efficient inference at full precision, the 27B Gemma 2 model operates on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU. This efficiency translates to lower deployment costs, making AI applications more accessible and affordable.
  • Rapid Inference Across Hardware: Optimized for speed, Gemma 2 runs incredibly fast on various hardware platforms, from high-end desktops and gaming laptops to cloud setups. You can experience Gemma 2’s full precision in Google AI Studio, use the quantized version with Gemma.cpp on your CPU, or run it on your home computer with an NVIDIA RTX or GeForce RTX GPU via Hugging Face Transformers (see the sketch below).
Source: the official Google blog post announcing Gemma 2
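
To make the Hugging Face Transformers path concrete, here is a minimal sketch of running the instruction-tuned 9B checkpoint on a single consumer RTX GPU. The 4-bit quantization settings and the prompt are illustrative assumptions (the quantization also requires the bitsandbytes package installed), not an official recipe.

```python
# Minimal sketch: run Gemma 2 9B (instruction-tuned) on a consumer RTX GPU.
# Assumes you have accepted the Gemma license on Hugging Face and logged in.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-2-9b-it"  # instruction-tuned 9B variant

# Quantize to 4-bit so the 9B weights fit in typical desktop GPU memory.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "Explain what makes an open-weights model useful for researchers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Dropping the quantization_config and loading in bfloat16 gives full-precision inference on larger hardware such as an A100 or H100.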

Built for Developers and Researchers

Gemma 2 is designed not only for power but also for seamless integration into your workflows:

  • Open and Accessible: Like its predecessors, Gemma 2 is available under a commercially friendly license, allowing developers and researchers to share and monetize their innovations.
  • Broad Framework Compatibility: Gemma 2 works effortlessly with major AI frameworks such as Hugging Face Transformers, JAX, PyTorch, and TensorFlow (via native Keras 3.0), as well as vLLM, Gemma.cpp, Llama.cpp, and Ollama. Additionally, Gemma is optimized for NVIDIA TensorRT-LLM to run on NVIDIA-accelerated infrastructure or as an NVIDIA NIM inference microservice, with optimization for NVIDIA’s NeMo coming soon. You can fine-tune models today using Keras and Hugging Face (a minimal LoRA sketch follows this list), with more parameter-efficient fine-tuning options on the way.
  • Effortless Deployment: Beginning next month, Google Cloud customers can deploy and manage Gemma 2 easily on Vertex AI.
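
As a concrete example of the fine-tuning path mentioned above, here is a minimal parameter-efficient LoRA sketch using Hugging Face Transformers and PEFT. The dataset, adapter targets, and hyperparameters are illustrative assumptions, not an official Gemma recipe.

```python
# Minimal LoRA fine-tuning sketch for Gemma 2 9B with Hugging Face PEFT.
# Dataset and hyperparameters are placeholders; swap in your own task data.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_id = "google/gemma-2-9b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach small low-rank adapters instead of updating all 9B parameters.
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Tiny illustrative dataset of short text samples.
dataset = load_dataset("Abirate/english_quotes", split="train[:200]")

def tokenize(batch):
    return tokenizer(batch["quote"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gemma2-lora", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4, bf16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because only the adapter weights are trained, this fits on a single GPU far more easily than a full fine-tune of the 9B model.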

The new Gemma Cookbook offers practical examples and recipes for building applications and fine-tuning Gemma 2 models for specific tasks. Learn to use Gemma efficiently with your preferred tools, including common tasks like retrieval-augmented generation.
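
As a toy illustration of the retrieval-augmented generation pattern the Cookbook covers, the sketch below ranks a handful of local notes by naive keyword overlap and stuffs the best match into the prompt. The keyword retriever is an assumption made for brevity; a real application would use an embedding model and a vector store.

```python
# Toy RAG sketch: pick the most relevant note, then ask Gemma 2 with it as context.
from transformers import pipeline

docs = [
    "Gemma 2 ships in 9B and 27B parameter sizes.",
    "The 27B model runs on a single TPU host or a single A100/H100 GPU.",
    "Gemma 2 weights are available on Kaggle and Hugging Face.",
]

def retrieve(query, documents):
    """Return the document sharing the most words with the query (toy retriever)."""
    q = set(query.lower().split())
    return max(documents, key=lambda d: len(q & set(d.lower().split())))

query = "What hardware does the 27B model need?"
context = retrieve(query, docs)

generator = pipeline("text-generation", model="google/gemma-2-9b-it", device_map="auto")
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
print(generator(prompt, max_new_tokens=100)[0]["generated_text"])
```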

Projects built with Gemma

Our first Gemma launch led to more than 10 million downloads and countless inspiring projects. Navarasa, for instance, used Gemma to create a model rooted in India’s linguistic diversity.

Gemma 2 empowers developers to take on more ambitious AI projects with higher performance and greater potential. We will keep innovating with new architectures and specialized variants, including an upcoming 2.6B parameter model that balances accessibility and power. For more details, check out the technical report.

How to Use and Install Gemma 2

Gemma 2 is now available in Google AI Studio, so you can test the full 27B model’s capabilities without any local hardware requirements.

You can also download Gemma 2’s model weights from Kaggle and Hugging Face Models, with Vertex AI Model Garden coming soon.
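
If you prefer to fetch the weights programmatically, here is a minimal sketch using the huggingface_hub client. It assumes you have accepted the Gemma license on the model page and authenticated with `huggingface-cli login`.

```python
# Minimal sketch: download the Gemma 2 27B base weights from Hugging Face Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="google/gemma-2-27b",   # use "google/gemma-2-9b" for the smaller model
    local_dir="./gemma-2-27b",
)
print(f"Weights downloaded to {local_dir}")
```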

To enable access for research and development, Gemma 2 is also available free of charge through Kaggle or via a free tier for Colab notebooks.

Note: First-time Google Cloud customers may be eligible for $300 in credits. Academic researchers can apply to the Gemma 2 Academic Research Program to receive Google Cloud credits that accelerate their research with Gemma 2. Applications are open now through August 9.
