CountTokens API¶
The CountTokens API calculates the number of input tokens before sending a request to the Gemini API.
Use the CountTokens API to prevent requests from exceeding the model context window, and to estimate potential costs based on billable characters.
The CountTokens API can use the same `contents` parameter as Gemini API inference requests.
Supported models¶
Parameter list¶
Content¶
This class consists of two main properties: `role` and `parts`. The `role` property denotes the individual producing the content, while the `parts` property contains multiple elements, each representing a segment of data within a message.
| Parameter | Description |
| --- | --- |
| `role` | Optional: `string`. The identity of the entity that creates the message. Set the string to one of the following: `user` (the message is sent by a real person, for example a user-generated message) or `model` (the message is generated by the model). The `model` value is used to insert messages from the model into the conversation during multi-turn conversations. For non-multi-turn conversations, this field can be left blank or unset. |
| `parts` | `Part`. A list of ordered parts that make up a single message. Different parts may have different IANA MIME types. |
Part¶
A data type containing media that is part of a multi-part `Content` message.
| Parameter | Description |
| --- | --- |
| `text` | Optional: `string`. A text prompt or code snippet. |
| `inline_data` | Optional: `Blob`. Inline data in raw bytes. |
| `file_data` | Optional: `FileData`. Data stored in a file. |
Blob¶
Content blob. When possible, send the data as text rather than raw bytes.
| Parameter | Description |
| --- | --- |
| `mime_type` | `string`. IANA MIME type of the data. |
| `data` | `bytes`. Raw bytes. |
FileData¶
URI-based data.
| Parameter | Description |
| --- | --- |
| `mime_type` | `string`. IANA MIME type of the data. |
| `file_uri` | `string`. The Cloud Storage URI to the file storing the data. |
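Taken together, a single `Content` entry can mix these part types. The following sketch (the bucket path is hypothetical) shows one user message that pairs a `text` part with a `file_data` part:

```json
{
  "role": "user",
  "parts": [
    { "text": "Describe this image." },
    {
      "file_data": {
        "file_uri": "gs://my-bucket/images/photo.jpg",
        "mime_type": "image/jpeg"
      }
    }
  ]
}
```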
system_instruction¶
This field holds user-provided `system_instructions`. It is the same as `contents`, but with limited support for the content types.
| Parameter | Description |
| --- | --- |
| `role` | `string`. This field is ignored internally. |
| `parts` | `Part`. Text only. Instructions that users want to pass to the model. |
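As a sketch, a `countTokens` request body that supplies a text-only system instruction alongside `contents` might look like the following (the instruction text is illustrative):

```json
{
  "system_instruction": {
    "parts": [{ "text": "You are a concise assistant." }]
  },
  "contents": [
    {
      "role": "user",
      "parts": [{ "text": "What's the highest mountain in Africa?" }]
    }
  ]
}
```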
FunctionDeclaration¶
A structured representation of a function declaration, as defined by the OpenAPI 3.0 specification, that represents a function the model may generate JSON inputs for.
| Parameter | Description |
| --- | --- |
| `name` | `string`. The name of the function to call. |
| `description` | Optional: `string`. Description and purpose of the function. |
| `parameters` | Optional: `Schema`. Describes the parameters of the function in the OpenAPI JSON Schema Object format (OpenAPI 3.0 specification). |
| `response` | Optional: `Schema`. Describes the output from the function in the OpenAPI JSON Schema Object format (OpenAPI 3.0 specification). |
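For illustration, a minimal function declaration in the OpenAPI Schema Object format might look like this (the function name and fields are hypothetical, not part of the API):

```json
{
  "name": "get_current_weather",
  "description": "Returns the current weather for a given city.",
  "parameters": {
    "type": "object",
    "properties": {
      "location": { "type": "string", "description": "City name" }
    },
    "required": ["location"]
  }
}
```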
Examples¶
Get token count from text prompt¶
This example counts the tokens of a single text prompt:
REST¶
To get the token count and the number of billable characters for a prompt by
using the Vertex AI API, send a POST
request to the publisher model
endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request. Available options include the following:
us-central1
us-west4
northamerica-northeast1
us-east4
us-west1
asia-northeast3
asia-southeast1
asia-northeast1
- PROJECT_ID: Your project ID.
- MODEL_ID: The model ID of the multimodal model that you want to use.
- ROLE: The role in a conversation associated with the content. Specifying a role is required even in single-turn use cases. Acceptable values include the following:
  - USER: Specifies content that's sent by you.
- TEXT: The text instructions to include in the prompt.
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens
Request JSON body:
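The request body for this text-only example isn't shown on this page; based on the parameter list above, a minimal body has this shape (ROLE and TEXT are the placeholders described above):

```json
{
  "contents": [
    {
      "role": "ROLE",
      "parts": [{ "text": "TEXT" }]
    }
  ]
}
```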
To send your request, choose one of these options:
curl¶
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running `gcloud init` or `gcloud auth login`, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running `gcloud auth list`.
Save the request body in a file named `request.json`, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens"
PowerShell¶
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running `gcloud init` or `gcloud auth login`. You can check the currently active account by running `gcloud auth list`.
Save the request body in a file named `request.json`, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
Response¶
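The example response is collapsed on this page. Based on the CountTokens response fields, a response for a short text prompt has this shape (the counts shown are illustrative):

```json
{
  "totalTokens": 10,
  "totalBillableCharacters": 30
}
```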
Gen AI SDK for Python¶
Install¶
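The install command is collapsed on this page; the Gen AI SDK for Python is distributed as the `google-genai` package, so the usual install is:

```
pip install --upgrade google-genai
```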
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
from google import genai
from google.genai.types import HttpOptions
client = genai.Client(http_options=HttpOptions(api_version="v1"))
response = client.models.count_tokens(
model="gemini-2.0-flash-001",
contents="What's the highest mountain in Africa?",
)
print(response)
# Example output:
# total_tokens=10
# cached_content_token_count=None
Gen AI SDK for Go¶
Learn how to install or update the Gen AI SDK for Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
import (
"context"
"fmt"
"io"
genai "google.golang.org/genai"
)
// countWithTxt shows how to count tokens with text input.
func countWithTxt(w io.Writer) error {
ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
})
if err != nil {
return fmt.Errorf("failed to create genai client: %w", err)
}
modelName := "gemini-2.0-flash-001"
contents := []*genai.Content{
{Parts: []*genai.Part{
{Text: "What's the highest mountain in Africa?"},
}},
}
resp, err := client.Models.CountTokens(ctx, modelName, contents, nil)
if err != nil {
return fmt.Errorf("failed to count tokens: %w", err)
}
fmt.Fprintf(w, "Total: %d\nCached: %d\n", resp.TotalTokens, resp.CachedContentTokenCount)
// Example response:
// Total: 9
// Cached: 0
return nil
}
Node.js¶
const {VertexAI} = require('@google-cloud/vertexai');
/**
* TODO(developer): Update these variables before running the sample.
*/
async function countTokens(
projectId = 'PROJECT_ID',
location = 'us-central1',
model = 'gemini-2.0-flash-001'
) {
// Initialize Vertex with your Cloud project and location
const vertexAI = new VertexAI({project: projectId, location: location});
// Instantiate the model
const generativeModel = vertexAI.getGenerativeModel({
model: model,
});
const req = {
contents: [{role: 'user', parts: [{text: 'How are you doing today?'}]}],
};
// Prompt tokens count
const countTokensResp = await generativeModel.countTokens(req);
console.log('Prompt tokens count: ', countTokensResp);
// Send text to Gemini
const result = await generativeModel.generateContent(req);
// Response tokens count
const usageMetadata = result.response.usageMetadata;
console.log('Response tokens count: ', usageMetadata);
}
Java¶
import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.CountTokensResponse;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import java.io.IOException;
public class GetTokenCount {
public static void main(String[] args) throws IOException {
// TODO(developer): Replace these variables before running the sample.
String projectId = "your-google-cloud-project-id";
String location = "us-central1";
String modelName = "gemini-2.0-flash-001";
getTokenCount(projectId, location, modelName);
}
// Gets the number of tokens for the prompt and the model's response.
public static int getTokenCount(String projectId, String location, String modelName)
throws IOException {
// Initialize client that will be used to send requests.
// This client only needs to be created once, and can be reused for multiple requests.
try (VertexAI vertexAI = new VertexAI(projectId, location)) {
GenerativeModel model = new GenerativeModel(modelName, vertexAI);
String textPrompt = "Why is the sky blue?";
CountTokensResponse response = model.countTokens(textPrompt);
int promptTokenCount = response.getTotalTokens();
int promptCharCount = response.getTotalBillableCharacters();
System.out.println("Prompt token Count: " + promptTokenCount);
System.out.println("Prompt billable character count: " + promptCharCount);
GenerateContentResponse contentResponse = model.generateContent(textPrompt);
int tokenCount = contentResponse.getUsageMetadata().getPromptTokenCount();
int candidateTokenCount = contentResponse.getUsageMetadata().getCandidatesTokenCount();
int totalTokenCount = contentResponse.getUsageMetadata().getTotalTokenCount();
System.out.println("Prompt token Count: " + tokenCount);
System.out.println("Candidate Token Count: " + candidateTokenCount);
System.out.println("Total token Count: " + totalTokenCount);
return promptTokenCount;
}
}
}
Get token count from media prompt¶
This example counts the tokens of a prompt that uses various media types.
REST¶
To get the token count and the number of billable characters for a prompt by
using the Vertex AI API, send a POST
request to the publisher model
endpoint.
Before using any of the request data, make the following replacements:
- LOCATION: The region to process the request. Available options include the following:
us-central1
us-west4
northamerica-northeast1
us-east4
us-west1
asia-northeast3
asia-southeast1
asia-northeast1
- PROJECT_ID: Your project ID.
- MODEL_ID: The model ID of the multimodal model that you want to use.
- ROLE: The role in a conversation associated with the content. Specifying a role is required even in single-turn use cases. Acceptable values include the following:
  - USER: Specifies content that's sent by you.
- TEXT: The text instructions to include in the prompt.
- FILE_URI: The URI or URL of the file to include in the prompt. Acceptable values include the following:
  - Cloud Storage bucket URI: The object must either be publicly readable or reside in the same Google Cloud project that's sending the request. For gemini-2.0-flash and gemini-2.0-flash-lite, the size limit is 2 GB.
  - HTTP URL: The file URL must be publicly readable. You can specify one video file, one audio file, and up to 10 image files per request. Audio files, video files, and documents can't exceed 15 MB.
  - YouTube video URL: The YouTube video must either be owned by the account that you used to sign in to the Google Cloud console or be public. Only one YouTube video URL is supported per request.
  When specifying a `fileURI`, you must also specify the media type (`mimeType`) of the file. If VPC Service Controls is enabled, specifying a media file URL for `fileURI` is not supported.
- MIME_TYPE: The media type of the file specified in the `data` or `fileUri` fields. Acceptable values include the following:
application/pdf
audio/mpeg
audio/mp3
audio/wav
image/png
image/jpeg
image/webp
text/plain
video/mov
video/mpeg
video/mp4
video/mpg
video/avi
video/wmv
video/mpegps
video/flv
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens
Request JSON body:
{
"contents": [{
"role": "ROLE",
"parts": [
{
"file_data": {
"file_uri": "FILE_URI",
"mime_type": "MIME_TYPE"
}
},
{
"text": "TEXT"
}
]
}]
}
To send your request, choose one of these options:
curl¶
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running `gcloud init` or `gcloud auth login`, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running `gcloud auth list`.
Save the request body in a file named `request.json`, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens"
PowerShell¶
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running `gcloud init` or `gcloud auth login`. You can check the currently active account by running `gcloud auth list`.
Save the request body in a file named `request.json`, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:countTokens" | Select-Object -Expand Content
You should receive a JSON response similar to the following.
Response¶
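The example response is collapsed on this page. A response for a media prompt has the same shape as the text-only case; for the sample video used in the SDK examples below, the total is in the range of the 16252 tokens those examples print (the exact counts are illustrative):

```json
{
  "totalTokens": 16252
}
```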
Gen AI SDK for Python¶
Install¶
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
from google import genai
from google.genai.types import HttpOptions, Part
client = genai.Client(http_options=HttpOptions(api_version="v1"))
contents = [
Part.from_uri(
file_uri="gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
mime_type="video/mp4",
),
"Provide a description of the video.",
]
response = client.models.count_tokens(
model="gemini-2.0-flash-001",
contents=contents,
)
print(response)
# Example output:
# total_tokens=16252 cached_content_token_count=None
Gen AI SDK for Go¶
Learn how to install or update the Gen AI SDK for Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True
import (
"context"
"fmt"
"io"
genai "google.golang.org/genai"
)
// countWithTxtAndVid shows how to count tokens with text and video inputs.
func countWithTxtAndVid(w io.Writer) error {
ctx := context.Background()
client, err := genai.NewClient(ctx, &genai.ClientConfig{
HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
})
if err != nil {
return fmt.Errorf("failed to create genai client: %w", err)
}
modelName := "gemini-2.0-flash-001"
contents := []*genai.Content{
{Parts: []*genai.Part{
{Text: "Provide a description of the video."},
{FileData: &genai.FileData{
FileURI: "gs://cloud-samples-data/generative-ai/video/pixel8.mp4",
MIMEType: "video/mp4",
}},
}},
}
resp, err := client.Models.CountTokens(ctx, modelName, contents, nil)
if err != nil {
return fmt.Errorf("failed to count tokens: %w", err)
}
fmt.Fprintf(w, "Total: %d\nCached: %d\n", resp.TotalTokens, resp.CachedContentTokenCount)
// Example response:
// Total: 16252
// Cached: 0
return nil
}
Node.js¶
const {VertexAI} = require('@google-cloud/vertexai');
/**
* TODO(developer): Update these variables before running the sample.
*/
async function countTokens(
projectId = 'PROJECT_ID',
location = 'us-central1',
model = 'gemini-2.0-flash-001'
) {
// Initialize Vertex with your Cloud project and location
const vertexAI = new VertexAI({project: projectId, location: location});
// Instantiate the model
const generativeModel = vertexAI.getGenerativeModel({
model: model,
});
const req = {
contents: [
{
role: 'user',
parts: [
{
file_data: {
file_uri:
'gs://cloud-samples-data/generative-ai/video/pixel8.mp4',
mime_type: 'video/mp4',
},
},
{text: 'Provide a description of the video.'},
],
},
],
};
const countTokensResp = await generativeModel.countTokens(req);
console.log('Prompt Token Count:', countTokensResp.totalTokens);
console.log(
'Prompt Character Count:',
countTokensResp.totalBillableCharacters
);
// Send text to Gemini
const result = await generativeModel.generateContent(req);
const usageMetadata = result.response.usageMetadata;
console.log('Prompt Token Count:', usageMetadata.promptTokenCount);
console.log('Candidates Token Count:', usageMetadata.candidatesTokenCount);
console.log('Total Token Count:', usageMetadata.totalTokenCount);
}
Java¶
import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.Content;
import com.google.cloud.vertexai.api.CountTokensResponse;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;
import java.io.IOException;
public class GetMediaTokenCount {
public static void main(String[] args) throws IOException {
// TODO(developer): Replace these variables before running the sample.
String projectId = "your-google-cloud-project-id";
String location = "us-central1";
String modelName = "gemini-2.0-flash-001";
getMediaTokenCount(projectId, location, modelName);
}
// Gets the number of tokens for the prompt with text and video and the model's response.
public static int getMediaTokenCount(String projectId, String location, String modelName)
throws IOException {
// Initialize client that will be used to send requests.
// This client only needs to be created once, and can be reused for multiple requests.
try (VertexAI vertexAI = new VertexAI(projectId, location)) {
GenerativeModel model = new GenerativeModel(modelName, vertexAI);
Content content = ContentMaker.fromMultiModalData(
"Provide a description of the video.",
PartMaker.fromMimeTypeAndData(
"video/mp4", "gs://cloud-samples-data/generative-ai/video/pixel8.mp4")
);
CountTokensResponse response = model.countTokens(content);
int tokenCount = response.getTotalTokens();
System.out.println("Token count: " + tokenCount);
return tokenCount;
}
}
}
What's next¶
- Learn more about the Gemini API.