Google Launches Gemma – Lightweight Open LLMs

Author at ApiX-Drive

Shortly after releasing its latest AI model, Gemini, Google announced the launch of Gemma, a new family of lightweight open language models. The new models are built on the same research and technology that underpin Gemini. According to company representatives, the Gemma family was conceived to give developers the tools for a more deliberate, safe, and responsible approach to building AI.

The first models in this series, Gemma 2B and Gemma 7B, are each available in pre-trained and instruction-tuned variants. They are already open for both commercial and research use, free of charge. Alongside the models, Google provides a set of developer tools called the Responsible Generative AI Toolkit, intended to make it easier to build safer AI applications.

Gemma 2B and 7B stand out from other AI models by delivering strong performance for their size. Google attributes this to the technical foundation they share with Gemini, its most powerful AI model. The company claims that on the MMLU benchmark the new models outperform well-known open models such as Mistral 7B and Llama 2 13B.

A significant advantage of Gemma 2B and 7B is that they can run on ordinary desktop computers and laptops as well as in Google Cloud, which makes them more accessible than many larger competing models. They are also optimized for NVIDIA GPUs. In addition, they integrate with popular services and tools such as Colab, Kaggle, Hugging Face, MaxText, and TensorRT-LLM.

Tris Warkentin, director of product management at Google DeepMind, highlighted the significant progress in generation quality over the past year. Capabilities that were previously available only in the largest models are now within reach of much smaller LLMs. The ability to run and tune models on local devices with RTX GPUs, or on Cloud TPUs in GCP, significantly broadens what developers can do. According to Warkentin, this opens up entirely new ways of building AI-based applications.