Imagen for Captioning & VQA
Imagen for Captioning and VQA takes an image and either generates a caption that describes the image or answers a question about the image (visual question answering, VQA).
Imagen currently supports five languages: English, German, French, Spanish and Italian.
You can use Imagen for Captioning and VQA in the Google Cloud console or send a request to the Vertex AI API.
Using the SDK / API
Before using any of the request data, make the following replacements:
PROJECT_ID: Your Google Cloud project ID.
B64_IMAGE: The image to get captions for. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
RESPONSE_COUNT: The number of image captions you want to generate. Accepted integer values: 1-3.
LANGUAGE_CODE: One of the supported language codes. Languages supported: English (en), French (fr), German (de), Italian (it), Spanish (es).
HTTP method and URL:
Request JSON body:
To send your request, save the request body in a file named request.json, and execute the following command:
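As a minimal sketch, the replacements described above could be assembled into a request body in Python as follows. Only `sampleCount` is named in this page; the field names `instances`, `image`, `bytesBase64Encoded`, and `language` are assumptions and should be checked against the API reference. The helper name `build_caption_request` is hypothetical, and the 10 MB limit is assumed to apply to the raw image bytes (taken here as 10 * 1024 * 1024).

```python
import base64
import json

# Language codes listed in this section.
SUPPORTED_LANGUAGES = {"en", "fr", "de", "it", "es"}

# Assumption: the 10 MB limit applies to the raw (pre-encoding) bytes.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


def build_caption_request(image_bytes, response_count=1, language_code="en"):
    """Return a captioning request body as a dict.

    Field names other than "sampleCount" are assumptions, not confirmed
    by the page this sketch accompanies.
    """
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 10 MB size limit")
    if not 1 <= response_count <= 3:
        raise ValueError("response_count must be an integer from 1 to 3")
    if language_code not in SUPPORTED_LANGUAGES:
        raise ValueError("unsupported language code: " + language_code)

    # B64_IMAGE: the image as a base64-encoded string.
    b64_image = base64.b64encode(image_bytes).decode("ascii")
    return {
        "instances": [{"image": {"bytesBase64Encoded": b64_image}}],
        "parameters": {
            "sampleCount": response_count,
            "language": language_code,
        },
    }


def to_request_json(body):
    """Serialize the body the way it would be saved in request.json."""
    return json.dumps(body, indent=2)
```

Writing the string returned by `to_request_json` to a file named request.json gives the file that the send step expects. Validating the count and language code locally surfaces mistakes before any request is made.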
Example response
The following sample response is for a request with "sampleCount": 2. The response returns two prediction strings.
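A response of this shape could be read in Python roughly as follows. This is a sketch that assumes the prediction strings arrive in a top-level `predictions` array, which this page does not confirm; the helper name `extract_predictions` is hypothetical, and the sample text in the usage note is made up for illustration.

```python
import json
from typing import List


def extract_predictions(response_text: str) -> List[str]:
    """Return the prediction strings from a JSON response body.

    Assumption: predictions are stored under a top-level "predictions" key.
    """
    payload = json.loads(response_text)
    predictions = payload.get("predictions", [])
    if not isinstance(predictions, list):
        raise ValueError('"predictions" is not a list')
    # Each prediction is expected to be a plain string.
    return [str(item) for item in predictions]
```

For example, calling `extract_predictions('{"predictions": ["first caption", "second caption"]}')` returns a list with the two strings, matching a request that set "sampleCount": 2.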
Using the API (curl)
Before using any of the request data, make the following replacements:
PROJECT_ID: Your Google Cloud project ID.
VQA_PROMPT: The question you want to get answered about your image. For example: "What is this?"
B64_IMAGE: The image to ask the question about. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
RESPONSE_COUNT: The number of answers you want to generate. Accepted integer values: 1-3.
HTTP method and URL:
Request JSON body:
To send your request, save the request body in a file named request.json, and execute the following command:
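The same request could also be prepared from Python instead of curl. The sketch below builds, but does not send, a POST request with the standard library. The keys `prompt` and `sampleCount` appear in this page; `instances`, `image`, and `bytesBase64Encoded` are assumptions. The helper name `build_vqa_request` is hypothetical, and the endpoint URL and access token are supplied by the caller rather than guessed here.

```python
import base64
import json
import urllib.request


def build_vqa_request(endpoint_url, access_token, image_bytes,
                      vqa_prompt, response_count=1):
    """Build (but do not send) a VQA prediction request.

    endpoint_url: the URL from "HTTP method and URL" with PROJECT_ID filled in.
    access_token: an OAuth 2.0 bearer token obtained by the caller.
    Field names other than "prompt" and "sampleCount" are assumptions.
    """
    if not vqa_prompt.strip():
        raise ValueError("vqa_prompt must not be empty")
    if not 1 <= response_count <= 3:
        raise ValueError("response_count must be an integer from 1 to 3")

    body = {
        "instances": [
            {
                "prompt": vqa_prompt,
                "image": {
                    "bytesBase64Encoded":
                        base64.b64encode(image_bytes).decode("ascii"),
                },
            }
        ],
        "parameters": {"sampleCount": response_count},
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes it easy to inspect the body first.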
Example response
The following sample response is for a request with "sampleCount": 2 and "prompt": "What is this?". The response returns two prediction strings, each an answer to the question.
Resource ID | Release date | Release stage | Description |
---|---|---|---|
imagetext-001 | 2023-07-17 | General Availability | |
imagetext-001 | 2024-04-01 | General Availability | Additional stable model release with quality upgrades. It doesn't yet support editing or upscaling. |