Google models¶
Featured Gemini models¶
2.5 Pro (preview)
A preview version of our most advanced reasoning model to date
- Input audio, images, video, and text, get text responses
- See the model's thinking process as part of the response
- Best for solving complex coding and reasoning problems
2.0 Flash
Our newest multimodal model, with next-generation features and improved capabilities
- Input audio, images, video, and text, get text responses
- Generate code and images, extract data, analyze files, generate graphs, and more
- Low latency, enhanced performance, built to power agentic experiences
2.0 Flash-Lite
A Gemini 2.0 Flash model optimized for cost efficiency and low latency
- Input audio, images, video, and text, get text responses
- Outperforms 1.5 Flash on the majority of benchmarks
- A 1-million-token context window and multimodal input, like 2.0 Flash
Generally available Gemini models¶
Gemini 2.0 Flash: Our newest multimodal model, with next-generation features and improved capabilities
Gemini 2.0 Flash-Lite: A Gemini 2.0 Flash model optimized for cost efficiency and low latency
Preview Gemini models¶
Gemini 2.5 Pro: Our most advanced reasoning model to date
Gemini 2.5 Flash: A thinking model with well-rounded capabilities, designed to balance price and performance
Gemma models¶
Gemma 3: Our latest Gemma open model, which solves a wide variety of tasks with text and image input, supports over 140 languages, and has a long 128K context window
Gemma 2: The second generation of our open models, featuring text generation, summarization, and extraction
Gemma: A small, lightweight open model supporting text generation, summarization, and extraction
ShieldGemma 2: Instruction-tuned models for evaluating the safety of text and images against a set of defined safety policies
PaliGemma: Our open vision-language model that combines SigLIP and Gemma
CodeGemma: Powerful, lightweight open model that can perform a variety of coding tasks, such as fill-in-the-middle code completion, code generation, natural language understanding, mathematical reasoning, and instruction following
TxGemma: Generates predictions, classifications, or text based on therapeutic-related data and can be used to efficiently build AI models for therapeutic-related tasks with less data and less compute
Embeddings models¶
Embeddings for Text: Converts text data into vector representations for semantic search, classification, clustering, and similar tasks
Multimodal Embeddings: Generates vectors based on images, which can be used for downstream tasks like image classification, image search, and more
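To illustrate how embedding vectors support semantic search, here is a minimal Python sketch that ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are made-up placeholders, not output from any embeddings model. Real embeddings have many more dimensions and would come from an embeddings API call that is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document names ordered from most to least similar to the query."""
    return sorted(
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
        reverse=True,
    )


# Placeholder vectors (assumption): stand-ins for real embedding output.
doc_vecs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.0],
    "holiday hours": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]

for name in rank_by_similarity(query_vec, doc_vecs):
    print(name, round(cosine_similarity(query_vec, doc_vecs[name]), 3))
```

The same ranking step applies to classification and clustering in spirit: all three tasks compare vectors by distance or angle rather than comparing the raw text.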
Imagen models¶
Imagen 3 for Generation: Use text prompts to generate novel images
Imagen 3 for Editing and Customization: Use text prompts to edit existing input images or parts of an image with a mask, or to generate new images based on the context provided by input reference images
Imagen 3 for Fast Generation: Use text prompts to generate novel images with lower latency than our other image generation models
Imagen for Captioning & VQA: Generate captions for input images and answer questions about their content (visual question answering)
Veo models¶
Veo 2 for Generation: Use text prompts and images to generate novel videos
MedLM models¶
MedLM-medium: HIPAA-compliant suite of medically tuned models designed to help healthcare practitioners with medical question-answering tasks and with summarization tasks for healthcare and medical documents
MedLM-large: HIPAA-compliant suite of medically tuned models designed to help healthcare practitioners with medical question-answering tasks and with summarization tasks for healthcare and medical documents
Language support¶
Gemini¶
All the Gemini models can understand and respond in the following languages:
Arabic (ar), Bengali (bn), Bulgarian (bg), Chinese (Simplified and Traditional) (zh), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hebrew (iw), Hindi (hi), Hungarian (hu), Indonesian (id), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Lithuanian (lt), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Serbian (sr), Slovak (sk), Slovenian (sl), Spanish (es), Swahili (sw), Swedish (sv), Thai (th), Turkish (tr), Ukrainian (uk), Vietnamese (vi)
The Gemini 2.0 Flash, Gemini 1.5 Pro, and Gemini 1.5 Flash models can also understand and respond in the following additional languages:
Afrikaans (af), Amharic (am), Assamese (as), Azerbaijani (az), Belarusian (be), Bosnian (bs), Catalan (ca), Cebuano (ceb), Corsican (co), Welsh (cy), Dhivehi (dv), Esperanto (eo), Basque (eu), Persian (fa), Filipino (Tagalog) (fil), Frisian (fy), Irish (ga), Scots Gaelic (gd), Galician (gl), Gujarati (gu), Hausa (ha), Hawaiian (haw), Hmong (hmn), Haitian Creole (ht), Armenian (hy), Igbo (ig), Icelandic (is), Javanese (jv), Georgian (ka), Kazakh (kk), Khmer (km), Kannada (kn), Krio (kri), Kurdish (ku), Kyrgyz (ky), Latin (la), Luxembourgish (lb), Lao (lo), Malagasy (mg), Maori (mi), Macedonian (mk), Malayalam (ml), Mongolian (mn), Meiteilon (Manipuri) (mni-Mtei), Marathi (mr), Malay (ms), Maltese (mt), Myanmar (Burmese) (my), Nepali (ne), Nyanja (Chichewa) (ny), Odia (Oriya) (or), Punjabi (pa), Pashto (ps), Sindhi (sd), Sinhala (Sinhalese) (si), Samoan (sm), Shona (sn), Somali (so), Albanian (sq), Sesotho (st), Sundanese (su), Tamil (ta), Telugu (te), Tajik (tg), Uyghur (ug), Urdu (ur), Uzbek (uz), Xhosa (xh), Yiddish (yi), Yoruba (yo), Zulu (zu)
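The two lists above can be encoded as a simple lookup. The sketch below is illustrative only: `supports_language` is a hypothetical helper, not part of any Google SDK, and it assumes the model name passed in refers to a Gemini model, written exactly as on this page.

```python
# Language codes copied from the first list (all Gemini models).
BASE_CODES = set(
    "ar bn bg zh hr cs da nl en et fi fr de el iw hi hu id it ja ko lv lt "
    "no pl pt ro ru sr sk sl es sw sv th tr uk vi".split()
)

# Language codes copied from the second list (additional languages).
EXTENDED_CODES = set(
    "af am as az be bs ca ceb co cy dv eo eu fa fil fy ga gd gl gu ha haw "
    "hmn ht hy ig is jv ka kk km kn kri ku ky la lb lo mg mi mk ml mn "
    "mni-Mtei mr ms mt my ne ny or pa ps sd si sm sn so sq st su ta te tg "
    "ug ur uz xh yi yo zu".split()
)

# Models that this page names as supporting the additional languages.
EXTENDED_MODELS = {"Gemini 2.0 Flash", "Gemini 1.5 Pro", "Gemini 1.5 Flash"}


def supports_language(model: str, code: str) -> bool:
    """Hypothetical helper: check a language code against the lists above.

    Assumes `model` is a Gemini model; other model families are not covered.
    """
    if code in BASE_CODES:
        return True
    return model in EXTENDED_MODELS and code in EXTENDED_CODES


print(supports_language("Gemini 2.0 Flash", "cy"))       # Welsh, extended list
print(supports_language("Gemini 2.0 Flash-Lite", "cy"))  # not in the extended set
```

Note that the list uses `iw` for Hebrew, so a check with the newer code `he` would fail unless you normalize codes first.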
Gemma¶
Gemma 3 supports over 140 languages. Earlier Gemma models support only the English language.
Embeddings¶
Multilingual text embedding models support the following languages:
Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.
Imagen 3¶
Imagen 3 supports the following languages:
English, Chinese, Hindi, Japanese, Korean, Portuguese, and Spanish.
MedLM¶
The MedLM models support the English language.
Explore all models in Model Garden¶
Model Garden is a platform that helps you discover, test, customize, and deploy Google proprietary and select OSS models and assets. To explore the generative AI models and APIs that are available on Vertex AI, go to Model Garden in the Google Cloud console.
To learn more about Model Garden, including available models and capabilities, see Explore AI models in Model Garden.
Model versions¶
To see all model versions, including legacy and retired models, see Model versions and lifecycle.
What's next¶
- Try a quickstart tutorial using Vertex AI Studio or the Vertex AI API.
- Explore pretrained models in Model Garden.
- Learn how to control access to specific models in Model Garden by using a Model Garden organization policy.
- Learn about pricing.