Gemma

Lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models

Overview

Gemma is a family of lightweight, state-of-the-art open models built from research and technology used to create Google Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.

Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources such as a laptop, desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

This model card includes the 2B and 7B model variants.
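
For a quick sanity check of the open weights, a minimal local-inference sketch using the Hugging Face Transformers checkpoints is shown below. It is not part of the official documentation on this page and assumes you have accepted the Gemma terms and been granted access to the google/gemma-2b repository on Hugging Face.

```python
# Minimal local-inference sketch (assumes access to google/gemma-2b on Hugging Face).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")

# Generate a short continuation from a prompt; adjust max_new_tokens as needed.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```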

Use cases

Open Large Language Models (LLMs) have a wide range of applications across various industries and domains. The following list of potential uses is not comprehensive. The purpose of this list is to provide contextual information about the possible use-cases that the model creators considered as part of model training and development.

  • Content Creation and Communication
    • Text Generation: These models can be used to generate creative text formats such as poems, scripts, code, marketing copy and email drafts.
    • Chatbots and Conversational AI: Power conversational interfaces for customer service, virtual assistants, or interactive applications.
    • Text Summarization: Generate concise summaries of a text corpus, research papers, or reports.
  • Research and Education
    • Natural Language Processing (NLP) Research: These models can serve as a foundation for researchers to experiment with NLP techniques, develop algorithms, and contribute to the advancement of the field.
    • Language Learning Tools: Support interactive language learning experiences, aiding in grammar correction or providing writing practice.
    • Knowledge Exploration: Assist researchers in exploring large bodies of text by generating summaries or answering questions about specific topics.
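
As a concrete illustration of the text generation and summarization use cases above, the instruction-tuned 2B variant can be prompted through its chat template. The model ID and prompt below are illustrative only; access to google/gemma-2b-it on Hugging Face is assumed.

```python
# Illustrative summarization prompt with the instruction-tuned 2B variant
# (assumes access to google/gemma-2b-it on Hugging Face).
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")

article = "..."  # the text you want summarized
messages = [{"role": "user", "content": f"Summarize this in two sentences:\n{article}"}]

# The tokenizer's chat template wraps the message in Gemma's turn markers.
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```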

Documentation

Get started

You can deploy Gemma to Vertex AI or Google Kubernetes Engine (GKE).

To use this model, sign in to your Google Account and accept the Terms of Use.
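
As a hedged sketch only, a Vertex AI endpoint deployment with the Vertex AI SDK for Python looks roughly like the following. The project ID, bucket, serving container image URI, machine shape, and request schema are placeholders; copy the exact values from the deploy dialog for this model in Model Garden.

```python
# Rough sketch of deploying Gemma to a Vertex AI endpoint with the Vertex AI SDK.
# All identifiers below (project, bucket, container image, machine shape) are
# placeholders; use the real values shown in the Model Garden deploy dialog.
from google.cloud import aiplatform

aiplatform.init(
    project="your-project-id",
    location="us-central1",
    staging_bucket="gs://your-staging-bucket",
)

# Register the model with a text-generation serving container (image URI is a placeholder).
model = aiplatform.Model.upload(
    display_name="gemma-2b-it",
    serving_container_image_uri="us-docker.pkg.dev/your-project/your-repo/gemma-serve:latest",
)

# Deploy on GPU hardware; machine and accelerator types are illustrative only.
endpoint = model.deploy(
    machine_type="g2-standard-8",
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
)

# The instance schema depends on the serving container; this is a common shape.
response = endpoint.predict(instances=[{"prompt": "Write a haiku about the ocean.", "max_tokens": 64}])
print(response.predictions)
```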

Dataset and training

These models were trained on a large dataset of text data that includes a wide variety of sources, totaling 8 trillion tokens. Here are the key components:

  • Web Documents: A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary. Primarily English-language content.
  • Code: Exposing the model to code helps it to learn the syntax and patterns of programming languages, which could improve its ability to generate code or understand code-related questions.
  • Mathematics: Training on mathematical text helps the model learn logical reasoning and symbolic representation, and to address mathematical queries.

The combination of these diverse data sources is crucial for training a powerful language model that can handle a wide variety of different tasks and text formats.

Training data processing

Here are the key data cleaning and filtering methods applied to the training data:

  • CSAM Filtering: Rigorous CSAM (Child Sexual Abuse Material) filtering was applied at multiple stages in the data preparation process to ensure the exclusion of harmful and inappropriate content.
  • PII Filtering: PII (Personally Identifiable Information) filtering was performed using a specialized privacy protection tool to protect the privacy of individuals. Identifiers such as social security numbers and other sensitive information types were removed.
  • Additional methods: Filtering based on content quality and safety in line with our policies.
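
For intuition only, the kind of rule-based identifier scrubbing described in the PII bullet above can be sketched as a regex pass over documents. This toy example is not the specialized privacy protection tool actually used; real pipelines rely on many more signals and patterns.

```python
# Toy sketch of rule-based PII filtering, illustrative only -- not the
# specialized privacy protection tool referenced above.
import re

PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # US social security numbers
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),     # email addresses
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),  # simple phone formats
}

def scrub_pii(text: str) -> str:
    """Replace matched identifiers with typed placeholders."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}_REMOVED]", text)
    return text

print(scrub_pii("Contact jane@example.com or 555-867-5309, SSN 123-45-6789."))
```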

Fast Deployment Option

Preview

This feature is a preview offering, subject to the "Pre-GA Offerings Terms" of the Service Specific Terms. Pre-GA products and features may have limited support, and changes to pre-GA products and features may not be compatible with other pre-GA versions. For more information, see the launch stage descriptions.

The Fast Deployment feature prioritizes speed for model exploration, making it ideal for initial testing and experimentation. For sensitive data or production workloads, use the Standard environment for enhanced security and stability.

Hardware

Gemma was trained using the latest generation of Tensor Processing Unit (TPU) hardware (TPUv5e).

Training large language models requires significant computational power. TPUs, designed specifically for matrix operations common in machine learning, offer several advantages in this domain:

  • Performance: TPUs are specifically designed to handle the massive computations involved in training LLMs. They can speed up training considerably compared to CPUs.
  • Memory: TPUs often come with large amounts of high-bandwidth memory, allowing for the handling of large models and batch sizes during training. This can lead to better model quality.
  • Scalability: TPU Pods (large clusters of TPUs) provide a scalable solution for handling the growing complexity of large foundation models. You can distribute training across multiple TPU devices for faster and more efficient processing.
  • Cost-effectiveness: In many scenarios, TPUs can provide a more cost-effective solution for training large models compared to CPU-based infrastructure, especially when considering the time and resources saved due to faster training.

These advantages are aligned with Google's commitments to operate sustainably.

Software

Training was done using JAX and ML Pathways.

JAX allows researchers to leverage the latest generation of hardware, including TPUs, for faster and more efficient training of large models.

ML Pathways is Google's latest effort to build artificially intelligent systems capable of generalizing across multiple tasks. This is especially suitable for foundation models, including large language models like these.

Together, JAX and ML Pathways are used as described in the paper about the Gemini family of models: "the 'single controller' programming model of Jax and Pathways allows a single Python process to orchestrate the entire training run, dramatically simplifying the development workflow."
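
To make the "single controller" idea concrete, here is a toy JAX sketch (not Google's training code): one Python process defines a training step and maps it across every local accelerator device.

```python
# Toy illustration of single-controller, data-parallel training in JAX.
# One Python process orchestrates all local devices (TPU cores or GPUs);
# this is not the actual Gemma training code.
import jax
import jax.numpy as jnp

print("Devices visible to this process:", jax.devices())

def loss_fn(params, x):
    # toy quadratic "model": mean squared activation
    return jnp.mean((x @ params) ** 2)

@jax.pmap  # replicate the step across devices; the leading axis is the device axis
def train_step(params, batch):
    grads = jax.grad(loss_fn)(params, batch)
    return params - 0.01 * grads  # plain SGD update

n_dev = jax.local_device_count()
params = jnp.broadcast_to(jnp.ones((8, 4)), (n_dev, 8, 4))  # replicated parameters
batch = jnp.ones((n_dev, 16, 8))                            # one data shard per device
params = train_step(params, batch)                          # all devices step in lock-step
print("Per-device parameter shape after one step:", params.shape)
```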

Evaluation metrics

These models were evaluated against a large collection of different datasets and metrics to cover different aspects of text generation:

Benchmark | Metric | 2B Params | 7B Params
MMLU | 5-shot, top-1 | 42.3 | 64.3
HellaSwag | 0-shot | 71.4 | 81.2
PIQA | 0-shot | 77.3 | 81.2
SocialIQA | 0-shot | 49.7 | 51.8
BoolQ | 0-shot | 69.4 | 83.2
WinoGrande | partial score | 65.4 | 72.3
CommonsenseQA | 7-shot | 65.3 | 71.3
OpenBookQA | | 47.8 | 52.8
ARC-e | | 73.2 | 81.5
ARC-c | | 42.1 | 53.2
TriviaQA | 5-shot | 53.2 | 63.4
Natural Questions | 5-shot | 12.5 | 23.0
HumanEval | pass@1 | 22.0 | 32.3
MBPP | 3-shot | 29.2 | 44.4
GSM8K | maj@1 | 17.7 | 46.4
MATH | 4-shot | 11.8 | 24.3
AGIEval | | 24.2 | 41.7
BIG-Bench | | 35.2 | 55.1
Average | | 44.9 | 56.4
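
For readers unfamiliar with the metric names, here is a generic toy sketch (not the actual evaluation harness) of how a "5-shot, top-1" multiple-choice evaluation is typically framed: five solved examples are prepended to the test question, and the model's single best answer is compared with the gold label.

```python
# Toy sketch of few-shot multiple-choice evaluation ("k-shot, top-1").
# Generic illustration only; not the harness used to produce the table above.
def build_few_shot_prompt(shots, question, choices):
    """shots: list of (question, choices, answer_letter) worked examples."""
    blocks = []
    for q, ch, ans in shots:
        opts = "\n".join(f"{letter}. {c}" for letter, c in zip("ABCD", ch))
        blocks.append(f"Question: {q}\n{opts}\nAnswer: {ans}")
    opts = "\n".join(f"{letter}. {c}" for letter, c in zip("ABCD", choices))
    blocks.append(f"Question: {question}\n{opts}\nAnswer:")
    return "\n\n".join(blocks)

def top1_accuracy(predictions, gold_labels):
    """Fraction of questions where the model's single best answer is correct."""
    return sum(p == g for p, g in zip(predictions, gold_labels)) / len(gold_labels)
```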

Ethics and safety

Our evaluation methods include structured evaluations and internal red-teaming testing of relevant content policies. Red-teaming was conducted by a number of different teams, each with different goals and human evaluation metrics. These models were evaluated against a number of different categories relevant to ethics and safety, including:

  • Text-to-Text Content Safety: Human evaluation on prompts covering safety policies including child sexual abuse and exploitation, harassment, violence and gore, and hate speech.
  • Text-to-Text Representational Harms: Benchmark against relevant academic datasets such as WinoBias and BBQ Dataset.
  • Memorization: Automated evaluation of memorization of training data, including the risk of personally identifiable information exposure.
  • Large-scale harm: Tests for “dangerous capabilities,” such as chemical, biological, radiological, and nuclear (CBRN) risks.

Evaluation results

The results of ethics and safety evaluations are within acceptable thresholds for meeting internal policies for categories such as child safety, content safety, representational harms, memorization, and large-scale harms.

In addition to robust internal evaluations, results on well-known safety benchmarks such as BBQ, BOLD, Winogender, Winobias, RealToxicity, and TruthfulQA are shown here.

Benchmark | Metric | 2B Params | 7B Params
RealToxicity | average | 6.86 | 7.90
BOLD | | 45.57 | 49.08
CrowS-Pairs | top-1 | 45.82 | 51.33
BBQ Ambig | 1-shot, top-1 | 62.58 | 92.54
BBQ Disambig | top-1 | 54.62 | 71.99
Winogender | top-1 | 51.25 | 54.17
TruthfulQA | | 44.84 | 31.81
Winobias 1_2 | | 56.12 | 59.09
Winobias 2_2 | | 91.10 | 92.23
Toxigen | | 29.77 | 39.59

Best practices and limitations

Like any large language model, these models have certain limitations that users should be aware of.

  • Training Data
    • The quality and diversity of the training data significantly influence the model's capabilities. Biases or gaps in the training data can lead to limitations in the model's responses.
    • The scope of the training dataset determines the subject areas the model can handle effectively.
  • Context and Task Complexity
    • LLMs are better at tasks that can be framed with clear prompts and instructions. Open-ended or highly complex tasks might be challenging.
    • A model's performance can be influenced by the amount of context provided (longer context generally leads to better outputs, up to a certain point).
  • Language Ambiguity and Nuance
    • Natural language is inherently complex. LLMs might struggle to grasp subtle nuances, sarcasm, or figurative language.
  • Factual Accuracy
    • LLMs generate responses based on information they learned from their training datasets, but they are not knowledge bases. They may generate incorrect or outdated factual statements.
  • Common Sense
    • LLMs rely on statistical patterns in language. They might lack the ability to apply common sense reasoning in certain situations.

Ethical considerations and risks

The development of large language models (LLMs) raises several ethical concerns. In creating an open model, we have carefully considered the following:

  • Bias and Fairness
    • LLMs trained on large-scale, real-world text data can reflect socio-cultural biases embedded in the training material. These models underwent careful scrutiny; the input data pre-processing and the subsequent evaluations are described and reported in this card.
  • Misinformation and Misuse
    • LLMs can be misused to generate text that is false, misleading, or harmful.
    • Guidelines for responsible use with the model are provided; see the Responsible Generative AI Toolkit.
  • Transparency and Accountability:
    • This model card summarizes details on the models’ architecture, capabilities, limitations, and evaluation processes.
    • A responsibly developed open model offers the opportunity to share innovation by making LLM technology accessible to developers and researchers across the AI ecosystem.
  • Risks Identified and Mitigations:
    • Perpetuation of biases: Continuous monitoring (using evaluation metrics and human review) and the exploration of de-biasing techniques are encouraged during model training, fine-tuning, and other use cases.
    • Generation of harmful content: Mechanisms and guidelines for content safety are essential. Developers are encouraged to exercise caution and implement appropriate content safety safeguards based on their specific product policies and application use cases.
    • Misuse for malicious purposes: Technical limitations and developer and end-user education can help mitigate against malicious applications of LLMs. Educational resources and reporting mechanisms for users to flag misuse are provided. Prohibited uses of Gemma models are outlined in our Terms of Use.
    • Privacy violations: Models were trained on data filtered for removal of PII (Personally Identifiable Information). Developers are encouraged to adhere to privacy regulations with privacy-preserving techniques.

Versions

Resource ID | Release date | Release stage | Description
gemma | 2024-02-21 | GA |

Links

  • Gemma on Kaggle
  • Responsible Generative AI Toolkit

Model ID: publishers/google/models/gemma

Version name: gemma-2b

Tags

  • Task: Generation
  • Skill level: Beginner, Intermediate

License

Gemma Terms of Use

By using, reproducing, modifying, distributing, performing or displaying any portion or element of Gemma, Model Derivatives including via any Hosted Service, (each as defined below) (collectively, the "Gemma Services") or otherwise accepting the terms of this Agreement, you agree to be bound by this Agreement.

Section 1: DEFINITIONS

1.1 Definitions

(a) "Agreement" or "Gemma Terms of Use" means these terms and conditions that govern the use, reproduction, Distribution or modification of the Gemma Services and any terms and conditions incorporated by reference.

(b) "Distribution" or "Distribute" means any transmission, publication, or other sharing of Gemma or Model Derivatives to a third party, including by providing or making Gemma or its functionality available as a hosted service via API, web access, or any other electronic or remote means ("Hosted Service").

(c) "Gemma" means the set of machine learning language models, trained model weights and parameters identified in the Appendix, regardless of the source that you obtained it from.

(d) "Google" means Google LLC.

(e) "Model Derivatives" means all (i) modifications to Gemma, (ii) works based on Gemma, or (iii) any other machine learning model which is created by transfer of patterns of the weights, parameters, operations, or Output of Gemma, to that model in order to cause that model to perform similarly to Gemma, including distillation methods that use intermediate data representations or methods based on the generation of synthetic data Outputs by Gemma for training that model. For clarity, Outputs are not deemed Model Derivatives.

(f) "Output" means the information content output of Gemma or a Model Derivative that results from operating or otherwise using Gemma or the Model Derivative, including via a Hosted Service.

1.2

As used in this Agreement, "including" means "including without limitation".

Section 2: ELIGIBILITY AND USAGE

2.1 Eligibility

You represent and warrant that you have the legal capacity to enter into this Agreement (including being of sufficient age of consent). If you are accessing or using any of the Gemma Services for or on behalf of a legal entity, (a) you are entering into this Agreement on behalf of yourself and that legal entity, (b) you represent and warrant that you have the authority to act on behalf of and bind that entity to this Agreement and (c) references to "you" or "your" in the remainder of this Agreement refers to both you (as an individual) and that entity.

2.2 Use

You may use, reproduce, modify, Distribute, perform or display any of the Gemma Services only in accordance with the terms of this Agreement, and must not violate (or encourage or permit anyone else to violate) any term of this Agreement.

Section 3: DISTRIBUTION AND RESTRICTIONS

3.1 Distribution and Redistribution

You may reproduce or Distribute copies of Gemma or Model Derivatives if you meet all of the following conditions:

  1. You must include the use restrictions referenced in Section 3.2 as an enforceable provision in any agreement (e.g., license agreement, terms of use, etc.) governing the use and/or distribution of Gemma or Model Derivatives and you must provide notice to subsequent users you Distribute to that Gemma or Model Derivatives are subject to the use restrictions in Section 3.2.
  2. You must provide all third party recipients of Gemma or Model Derivatives a copy of this Agreement.
  3. You must cause any modified files to carry prominent notices stating that you modified the files.
  4. All Distributions (other than through a Hosted Service) must be accompanied by a "Notice" text file that contains the following notice: "Gemma is provided under and subject to the Gemma Terms of Use found at ai.google.dev/gemma/terms".

You may add your own intellectual property statement to your modifications and, except as set forth in this Section, may provide additional or different terms and conditions for use, reproduction, or Distribution of your modifications, or for any such Model Derivatives as a whole, provided your use, reproduction, modification, Distribution, performance, and display of Gemma otherwise complies with the terms and conditions of this Agreement. Any additional or different terms and conditions you impose must not conflict with the terms of this Agreement.

3.2 Use Restrictions

You must not use any of the Gemma Services:

  1. for the restricted uses set forth in the Gemma Prohibited Use Policy at ai.google.dev/gemma/prohibited_use_policy ("Prohibited Use Policy"), which is hereby incorporated by reference into this Agreement; or
  2. in violation of applicable laws and regulations.

To the maximum extent permitted by law, Google reserves the right to restrict (remotely or otherwise) usage of any of the Gemma Services that Google reasonably believes are in violation of this Agreement.

3.3 Generated Output

Google claims no rights in Outputs you generate using Gemma. You and your users are solely responsible for Outputs and their subsequent uses.

Section 4: ADDITIONAL PROVISIONS

4.1 Updates

Google may update Gemma from time to time.

4.2 Trademarks

Nothing in this Agreement grants you any rights to use Google's trademarks, trade names, logos or to otherwise suggest endorsement or misrepresent the relationship between you and Google. Google reserves any rights not expressly granted herein.

4.3 DISCLAIMER OF WARRANTY

UNLESS REQUIRED BY APPLICABLE LAW, THE GEMMA SERVICES, AND OUTPUTS, ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING, REPRODUCING, MODIFYING, PERFORMING, DISPLAYING OR DISTRIBUTING ANY OF THE GEMMA SERVICES OR OUTPUTS AND ASSUME ANY AND ALL RISKS ASSOCIATED WITH YOUR USE OR DISTRIBUTION OF ANY OF THE GEMMA SERVICES OR OUTPUTS AND YOUR EXERCISE OF RIGHTS AND PERMISSIONS UNDER THIS AGREEMENT.

4.4 LIMITATION OF LIABILITY

TO THE FULLEST EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY, CONTRACT, OR OTHERWISE, UNLESS REQUIRED BY APPLICABLE LAW, SHALL GOOGLE OR ITS AFFILIATES BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, EXEMPLARY, CONSEQUENTIAL, OR PUNITIVE DAMAGES, OR LOST PROFITS OF ANY KIND ARISING FROM THIS AGREEMENT OR RELATED TO, ANY OF THE GEMMA SERVICES OR OUTPUTS EVEN IF GOOGLE OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

4.5 Term, Termination, and Survival

The term of this Agreement will commence upon your acceptance of this Agreement (including acceptance by your use, modification, or Distribution, reproduction, performance or display of any portion or element of the Gemma Services) and will continue in full force and effect until terminated in accordance with the terms of this Agreement. Google may terminate this Agreement if you are in breach of any term of this Agreement. Upon termination of this Agreement, you must delete and cease use and Distribution of all copies of Gemma and Model Derivatives in your possession or control. Sections 1, 2.1, 3.3, 4.2 to 4.9 shall survive the termination of this Agreement.

4.6 Governing Law and Jurisdiction

This Agreement will be governed by the laws of the State of California without regard to choice of law principles. The UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The state and federal courts of Santa Clara County, California shall have exclusive jurisdiction of any dispute arising out of this Agreement.

4.7 Severability

If any provision of this Agreement is held to be invalid, illegal or unenforceable, the remaining provisions shall be unaffected thereby and remain valid as if such provision had not been set forth herein.

4.8 Entire Agreement

This Agreement states all the terms agreed between the parties and supersedes all other agreements between the parties as of the date of acceptance relating to its subject matter.

4.9 No Waiver

Google will not be treated as having waived any rights by not exercising (or delaying the exercise of) any rights under this Agreement.

Gemma Prohibited Use Policy

Google reserves the right to update this Gemma Prohibited Use Policy from time to time.

You may not use nor allow others to use Gemma or Model Derivatives to:

  1. Generate any content, including the outputs or results generated by Gemma or Model Derivatives, that infringes, misappropriates, or otherwise violates any individual's or entity's rights (including, but not limited to rights in copyrighted content).
  2. Perform or facilitate dangerous, illegal, or malicious activities, including:
    1. Facilitation or promotion of illegal activities or violations of law, such as:
      1. Promoting or generating content related to child sexual abuse or exploitation;
      2. Promoting or facilitating sale of, or providing instructions for synthesizing or accessing, illegal substances, goods, or services;
      3. Facilitating or encouraging users to commit any type of crimes; or
      4. Promoting or generating violent extremism or terrorist content.
    2. Engagement in the illegal or unlicensed practice of any vocation or profession including, but not limited to, legal, medical, accounting, or financial professional practices.
    3. Abuse, harm, interference, or disruption of services (or enable others to do the same), such as:
      1. Promoting or facilitating the generation or distribution of spam; or
      2. Generating content for deceptive or fraudulent activities, scams, phishing, or malware.
    4. Attempts to override or circumvent safety filters or intentionally drive Gemma or Model Derivatives to act in a manner that contravenes this Gemma Prohibited Use Policy.
    5. Generation of content that may harm or promote the harm of individuals or a group, such as:
      1. Generating content that promotes or encourages hatred;
      2. Facilitating methods of harassment or bullying to intimidate, abuse, or insult others;
      3. Generating content that facilitates, promotes, or incites violence;
      4. Generating content that facilitates, promotes, or encourages self harm;
      5. Generating personally identifying information for distribution or other harms;
      6. Tracking or monitoring people without their consent;
      7. Generating content that may have unfair or adverse impacts on people, particularly impacts related to sensitive or protected characteristics; or
      8. Generating, gathering, processing, or inferring sensitive personal or private information about individuals without obtaining all rights, authorizations, and consents required by applicable laws.
  3. Generate and distribute content intended to misinform, misrepresent or mislead, including:
    1. Misrepresentation of the provenance of generated content by claiming content was created by a human, or represent generated content as original works, in order to deceive;
    2. Generation of content that impersonates an individual (living or dead) without explicit disclosure, in order to deceive;
    3. Misleading claims of expertise or capability made particularly in sensitive areas (e.g. health, finance, government services, or legal);
    4. Making automated decisions in domains that affect material or individual rights or well-being (e.g., finance, legal, employment, healthcare, housing, insurance, and social welfare);
    5. Generation of defamatory content, including defamatory statements, images, or audio content; or
    6. Engaging in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices.
  4. Generate sexually explicit content, including content created for the purposes of pornography or sexual gratification (e.g. sexual chatbots). Note that this does not include content created for scientific, educational, documentary, or artistic purposes.
