Google Veo 3 API: How to Access and Integrate AI Video Generation in 2026

Complete guide to the Veo 3 API: authentication, Python SDK, Node.js, REST API, async operations, cost management, and production best practices.

Emma Chen · 10 min read · Apr 17, 2026

Google Veo 3 is one of the most capable AI video generation models available in 2026. Developers and businesses who want to build Veo 3 into their own applications can get programmatic access through the Google Cloud Vertex AI API. This guide covers getting access, authenticating, making requests, and handling responses.

What Is the Veo 3 API?

The Veo 3 API is available through Google Cloud's Vertex AI platform. Developers send text prompts (and optionally reference images) to Google's servers and receive generated videos, which are written to a Cloud Storage location they specify.

The API is designed for enterprise and developer use cases where you need to:

  • Integrate AI video generation into your own application
  • Automate video production at scale
  • Build products and services on top of Veo 3's capabilities
  • Process large batches of video generation requests

Prerequisites

Before you can use the Veo 3 API, you need:

  1. A Google Cloud account with billing enabled
  2. A Google Cloud project with the Vertex AI API enabled
  3. Appropriate IAM permissions (Vertex AI User role minimum)
  4. API credentials (service account key or Application Default Credentials)
  5. Veo 3 access — the model may require allowlist approval depending on your region and use case

Enabling the Vertex AI API

# Enable the Vertex AI API for your project
gcloud services enable aiplatform.googleapis.com --project=YOUR_PROJECT_ID

Setting Up Authentication

The recommended authentication method for production use is a service account:

# Create a service account
gcloud iam service-accounts create veo3-api-user \
  --display-name="Veo 3 API User" \
  --project=YOUR_PROJECT_ID

# Grant Vertex AI User role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:veo3-api-user@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Create and download key
gcloud iam service-accounts keys create veo3-key.json \
  --iam-account=veo3-api-user@YOUR_PROJECT_ID.iam.gserviceaccount.com

For local development, Application Default Credentials (ADC) is simpler:

gcloud auth application-default login

Making Your First API Request

Python SDK

The Google Cloud Python SDK is the recommended way to interact with the Veo 3 API:

pip install google-cloud-aiplatform

Basic text-to-video generation:

import vertexai
from vertexai.preview.vision_models import VideoGenerationModel

# Initialize Vertex AI
vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

# Load the Veo 3 model
model = VideoGenerationModel.from_pretrained("veo-3.0-generate-preview")

# Generate a video
operation = model.generate_video(
    prompt="A golden retriever puppy playing in autumn leaves, slow motion, cinematic, warm tones",
    output_gcs_uri="gs://YOUR_BUCKET/output/",
    duration_seconds=8,
    aspect_ratio="16:9",
    resolution="1080p"
)

# Wait for completion
response = operation.result(timeout=300)
print(f"Video generated: {response.generated_videos[0].video.uri}")

REST API

For languages without a Google Cloud SDK, you can use the REST API directly:

# Get access token
ACCESS_TOKEN=$(gcloud auth print-access-token)

# Submit generation request
curl -X POST \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/veo-3.0-generate-preview:predictLongRunning" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{
      "prompt": "Aerial view of a mountain lake at sunrise, mist on the water, slow drone descent",
      "duration_seconds": 8,
      "aspect_ratio": "16:9"
    }],
    "parameters": {
      "output_gcs_uri": "gs://YOUR_BUCKET/output/",
      "resolution": "1080p"
    }
  }'

The response includes an operation name that you poll for completion:

# Poll for completion
curl -X GET \
  "https://us-central1-aiplatform.googleapis.com/v1/OPERATION_NAME" \
  -H "Authorization: Bearer $ACCESS_TOKEN"
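The body returned by the polling call can be classified with a few lines of code. The sketch below assumes the response follows the standard long-running operation JSON shape (`done`, then either `error` or `response`); the helper name `operation_status` and the sample payloads are illustrative, not taken from a real API call.

```python
from typing import Any, Dict

def operation_status(operation: Dict[str, Any]) -> str:
    """Classify a decoded operation body as running, failed, or succeeded.

    Assumes the standard long-running operation fields: `done` is absent or
    false while work is in progress; once done, either `error` or `response`
    is present.
    """
    if not operation.get("done", False):
        return "running"
    if "error" in operation:
        return "failed"
    return "succeeded"

# Illustrative payloads (hypothetical, not captured from the API)
print(operation_status({"name": "OPERATION_NAME"}))            # running
print(operation_status({"done": True, "error": {"code": 8}}))  # failed
print(operation_status({"done": True, "response": {}}))        # succeeded
```

Keep polling while the status is "running", and read the output location from the `response` field only after it reports "succeeded".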

API Parameters Reference

Required Parameters

| Parameter | Type | Description |
|---|---|---|
| prompt | string | Text description of the video to generate |
| output_gcs_uri | string | Google Cloud Storage URI for output |

Optional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| duration_seconds | integer | 8 | Video length (1-8 seconds) |
| aspect_ratio | string | "16:9" | "16:9", "9:16", "1:1", or "4:3" |
| resolution | string | "720p" | "720p" or "1080p" |
| negative_prompt | string | null | Elements to exclude from generation |
| seed | integer | random | For reproducible results |
| guidance_scale | float | 7.5 | Prompt adherence strength (1-20) |
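Checking these values locally before submitting a request can catch mistakes early. The following is a minimal sketch that encodes only the ranges listed in the optional parameters above; `validate_params` is a hypothetical helper, not part of any Google SDK.

```python
from typing import List

ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3"}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}

def validate_params(duration_seconds: int = 8, aspect_ratio: str = "16:9",
                    resolution: str = "720p",
                    guidance_scale: float = 7.5) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    errors = []
    if not 1 <= duration_seconds <= 8:
        errors.append("duration_seconds must be between 1 and 8")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        errors.append(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if resolution not in ALLOWED_RESOLUTIONS:
        errors.append(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if not 1 <= guidance_scale <= 20:
        errors.append("guidance_scale must be between 1 and 20")
    return errors

print(validate_params())                      # []
print(len(validate_params(duration_seconds=12, aspect_ratio="21:9")))  # 2
```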

Image-to-Video Parameters

For image-to-video generation, add the reference image:

from vertexai.preview.vision_models import Image

# Load reference image
reference_image = Image.load_from_file("reference.jpg")

# Generate video from image
operation = model.generate_video(
    prompt="The scene comes to life, gentle breeze, cinematic",
    image=reference_image,
    output_gcs_uri="gs://YOUR_BUCKET/output/",
    duration_seconds=8
)

Handling Asynchronous Operations

Veo 3 video generation is asynchronous — you submit a request and poll for completion. Here's a robust implementation:

import time
import vertexai
from vertexai.preview.vision_models import VideoGenerationModel
from google.api_core import exceptions

def generate_video_with_retry(prompt: str, output_uri: str, max_retries: int = 3):
    vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")
    model = VideoGenerationModel.from_pretrained("veo-3.0-generate-preview")
    
    for attempt in range(max_retries):
        try:
            operation = model.generate_video(
                prompt=prompt,
                output_gcs_uri=output_uri,
                duration_seconds=8,
                aspect_ratio="16:9"
            )
            
            # Poll with exponential backoff
            wait_time = 30
            while not operation.done():
                print(f"Waiting {wait_time}s for generation...")
                time.sleep(wait_time)
                wait_time = min(wait_time * 1.5, 120)
            
            response = operation.result()
            return response.generated_videos[0].video.uri
            
        except exceptions.ResourceExhausted:
            if attempt < max_retries - 1:
                print(f"Rate limited, waiting 60s before retry {attempt + 2}...")
                time.sleep(60)
            else:
                raise
    
    return None
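The polling loop above waits indefinitely if an operation never finishes. One way to bound it, sketched here with the same 30-second start, 1.5x growth, and 120-second cap, is to separate the wait schedule from the loop and add an overall timeout. The names `backoff_schedule` and `poll_until_done` are hypothetical, and `is_done` stands in for a call such as `operation.done`.

```python
import time
from itertools import islice
from typing import Callable, Iterator

def backoff_schedule(initial: float = 30.0, factor: float = 1.5,
                     cap: float = 120.0) -> Iterator[float]:
    """Yield wait times that grow by `factor` until they reach `cap`."""
    wait = initial
    while True:
        yield wait
        wait = min(wait * factor, cap)

def poll_until_done(is_done: Callable[[], bool], timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once is_done() is true, or False if the next wait
    would push the total time slept past `timeout`."""
    waited = 0.0
    for wait in backoff_schedule():
        if is_done():
            return True
        if waited + wait > timeout:
            return False
        sleep(wait)
        waited += wait
    return False  # not reached: the schedule is infinite

print(list(islice(backoff_schedule(), 6)))
# [30.0, 45.0, 67.5, 101.25, 120.0, 120.0]
```

The timeout here counts only time spent sleeping, which is a simplification. Passing `sleep` as a parameter lets the loop be exercised in tests without real delays.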

Downloading Generated Videos

Generated videos are stored in Google Cloud Storage. Download them programmatically:

from google.cloud import storage

def download_video(gcs_uri: str, local_path: str):
    client = storage.Client()
    
    # Parse GCS URI
    bucket_name = gcs_uri.split("/")[2]
    blob_name = "/".join(gcs_uri.split("/")[3:])
    
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(blob_name)
    blob.download_to_filename(local_path)
    print(f"Downloaded to {local_path}")

# Usage
download_video(
    "gs://YOUR_BUCKET/output/generated_video.mp4",
    "/local/path/video.mp4"
)
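The index-based split above silently accepts strings that are not Cloud Storage URIs. A slightly stricter parser, shown as a sketch with a hypothetical `parse_gcs_uri` helper, rejects anything that does not start with `gs://` or lacks a bucket name:

```python
from typing import Tuple

def parse_gcs_uri(gcs_uri: str) -> Tuple[str, str]:
    """Split a gs:// URI into (bucket_name, blob_name)."""
    prefix = "gs://"
    if not gcs_uri.startswith(prefix):
        raise ValueError(f"Not a gs:// URI: {gcs_uri!r}")
    # Everything up to the first slash is the bucket; the rest is the object name
    bucket_name, _, blob_name = gcs_uri[len(prefix):].partition("/")
    if not bucket_name:
        raise ValueError(f"Missing bucket name: {gcs_uri!r}")
    return bucket_name, blob_name

print(parse_gcs_uri("gs://YOUR_BUCKET/output/generated_video.mp4"))
# ('YOUR_BUCKET', 'output/generated_video.mp4')
```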

Batch Processing

For generating multiple videos efficiently:

from concurrent.futures import ThreadPoolExecutor

prompts = [
    "Mountain sunrise, slow aerial drift, golden tones",
    "Ocean waves in slow motion, emerald water",
    "City street at night, neon reflections, rain",
    "Forest path in autumn, leaves falling, dappled light",
    "Product on white background, slow rotation, studio lighting"
]

def generate_single(prompt_and_index):
    index, prompt = prompt_and_index
    output_uri = f"gs://YOUR_BUCKET/batch/video_{index:03d}.mp4"
    return generate_video_with_retry(prompt, output_uri)

# Process in parallel (respect rate limits)
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(generate_single, enumerate(prompts)))

print(f"Generated {len([r for r in results if r])} videos successfully")

Rate Limits and Quotas

The Veo 3 API has rate limits that vary by project tier:

| Tier | Requests/minute | Concurrent operations |
|---|---|---|
| Default | 5 | 3 |
| Elevated | 20 | 10 |
| Enterprise | Custom | Custom |

To request elevated quotas, submit a quota increase request through the Google Cloud Console.
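Until a quota increase is granted, a client-side limiter can keep an application under its requests-per-minute ceiling. The sketch below is one possible approach, a sliding-window counter sized for the default tier of 5 requests per minute; `SlidingWindowLimiter` is a hypothetical class, and it is not thread-safe as written.

```python
import time
from collections import deque
from typing import Callable, Deque

class SlidingWindowLimiter:
    """Allow at most `max_calls` acquisitions in any `period`-second window."""

    def __init__(self, max_calls: int = 5, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._calls: Deque[float] = deque()  # timestamps of recent acquisitions

    def try_acquire(self) -> bool:
        """Record a call and return True if under the limit, else return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(max_calls=5, period=60.0)
print([limiter.try_acquire() for _ in range(6)])
# [True, True, True, True, True, False]
```

Call `try_acquire()` before each submission and wait or queue the request when it returns False.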

Cost Estimation

Veo 3 API pricing is based on video duration generated:

  • Standard (720p): ~$0.35 per second of video
  • High quality (1080p): ~$0.50 per second of video

For example, a batch of 100 eight-second clips at 1080p costs approximately 100 × 8 × $0.50 = $400.
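The same arithmetic can be wrapped in a small estimator for budgeting. This sketch uses the approximate per-second rates quoted above and works in integer cents to avoid floating-point rounding; `estimate_cost_usd` is a hypothetical helper, and the rates should be replaced with current pricing.

```python
# Approximate rates from this guide, in cents per second of generated video
RATE_CENTS_PER_SECOND = {"720p": 35, "1080p": 50}

def estimate_cost_usd(clips: int, seconds_per_clip: int, resolution: str) -> float:
    """Estimate the cost in US dollars of generating a batch of clips."""
    cents = clips * seconds_per_clip * RATE_CENTS_PER_SECOND[resolution]
    return cents / 100

print(estimate_cost_usd(100, 8, "1080p"))  # 400.0
print(estimate_cost_usd(100, 8, "720p"))   # 280.0
```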

For cost-sensitive applications, consider using Seedance's API as an alternative — it offers comparable quality at lower cost with commercial rights included.

Error Handling

Common errors and how to handle them:

import concurrent.futures

from google.api_core import exceptions

try:
    operation = model.generate_video(prompt=prompt, output_gcs_uri=output_uri)
    response = operation.result(timeout=300)
except exceptions.ResourceExhausted as e:
    # Rate limit hit: back off before retrying
    print(f"Rate limited: {e}")
except exceptions.InvalidArgument as e:
    # Bad prompt or parameters
    print(f"Invalid request: {e}")
except exceptions.PermissionDenied as e:
    # Authentication or authorization issue
    print(f"Permission denied: {e}")
except concurrent.futures.TimeoutError:
    # Long-running operations raise concurrent.futures.TimeoutError, which is
    # only an alias of the built-in TimeoutError on Python 3.11 and later
    print("Generation timed out. Check the operation status manually.")

Production Best Practices

Prompt Engineering for API Use

When using the API programmatically, prompt quality directly affects output quality:

def build_prompt(subject: str, action: str, environment: str, 
                  camera: str = "cinematic", style: str = "photorealistic") -> str:
    return f"{subject} {action}, {environment}, {camera}, {style}"

# Examples
prompts = [
    build_prompt("luxury watch", "slow 360-degree rotation", "black marble surface, soft studio lighting", "close-up, shallow depth of field", "commercial photography quality"),
    build_prompt("young professional", "walking confidently", "modern glass office building lobby", "slow dolly forward", "corporate, natural lighting"),
]

Output Management

from datetime import datetime

def get_output_uri(project_id: str, bucket: str, job_name: str) -> str:
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"gs://{bucket}/veo3/{job_name}/{timestamp}/"

Monitoring and Logging

import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("veo3_api")

def generate_with_logging(prompt: str, output_uri: str):
    logger.info(f"Starting generation: {prompt[:50]}...")
    start_time = time.time()
    
    result = generate_video_with_retry(prompt, output_uri)
    
    elapsed = time.time() - start_time
    logger.info(f"Generation complete in {elapsed:.1f}s: {result}")
    return result

Alternative: Seedance API

For applications where cost, access restrictions, or commercial rights are concerns, Seedance offers an API with:

  • Lower cost per generation
  • No access restrictions or allowlist requirements
  • Commercial rights included
  • Comparable output quality for most use cases

The Seedance API follows a similar request/response pattern, making it straightforward to switch between providers or use both in the same application.

Learn more about Seedance at seedance.tv →

Cost Management

Veo 3 API pricing is based on video seconds generated. Understanding the cost structure helps you build cost-effective applications.

Pricing Structure

  • Standard resolution (720p): Charged per second of video generated
  • High resolution (1080p): Higher per-second rate
  • Failed generations: Not charged if the generation fails before producing output

Cost Optimization Strategies

1. Use lower resolution for testing

During development and testing, use 720p to reduce costs. Switch to 1080p only for production output.

# Development/testing
resolution = "720p"

# Production
resolution = "1080p"

2. Implement prompt validation before generation

Validate prompts before sending them to the API so that obviously malformed requests (too short or too long) are rejected locally instead of consuming quota:

def validate_prompt(prompt: str) -> bool:
    # Check minimum length
    if len(prompt.strip()) < 10:
        return False
    # Check maximum length
    if len(prompt) > 1000:
        return False
    # Add your own validation logic
    return True

3. Cache generated videos

Store generated videos in GCS and implement a caching layer to avoid regenerating the same content:

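import hashlib

from google.cloud import storage

def get_cache_key(prompt: str, params: dict) -> str:
    # Sort the parameters so equal settings always produce the same key
    content = f"{prompt}:{sorted(params.items())}"
    return hashlib.md5(content.encode()).hexdigest()

def gcs_file_exists(gcs_uri: str) -> bool:
    # True if an object already exists at the given gs:// URI
    bucket_name, _, blob_name = gcs_uri[len("gs://"):].partition("/")
    return storage.Client().bucket(bucket_name).blob(blob_name).exists()

def get_or_generate(prompt: str, params: dict) -> str:
    cache_key = get_cache_key(prompt, params)
    cached_uri = f"gs://YOUR_BUCKET/cache/{cache_key}.mp4"
    
    # Check if cached version exists
    if gcs_file_exists(cached_uri):
        return cached_uri
    
    # Generate and cache. generate_video stands for your own wrapper around
    # model.generate_video that applies params and writes to cached_uri.
    result = generate_video(prompt, cached_uri, params)
    return result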

4. Use batch processing for large volumes

For large-scale generation, batch requests to maximize throughput while staying within quota limits:

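import asyncio

# max_concurrent defaults to 3 to match the default concurrent-operations quota
async def batch_generate(prompts: list, output_prefix: str, max_concurrent: int = 3):
    semaphore = asyncio.Semaphore(max_concurrent)
    
    async def generate_one(index, prompt):
        async with semaphore:
            # Run the blocking call (defined earlier in this guide) in a worker thread
            output_uri = f"{output_prefix}video_{index:03d}/"
            return await asyncio.to_thread(generate_video_with_retry, prompt, output_uri)
    
    tasks = [generate_one(i, p) for i, p in enumerate(prompts)]
    # return_exceptions=True keeps one failure from cancelling the rest
    return await asyncio.gather(*tasks, return_exceptions=True)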

Error Handling

Robust error handling is essential for production Veo 3 API integrations:

import time

from google.api_core import exceptions as google_exceptions

def handle_veo3_error(error):
    """Wait and return for retryable errors; re-raise everything else."""
    if isinstance(error, google_exceptions.ResourceExhausted):
        # Quota exceeded: wait, then let the caller retry
        print("Quota exceeded. Waiting before retry...")
        time.sleep(60)
    elif isinstance(error, google_exceptions.InvalidArgument):
        # Bad request: check prompt and parameters
        print(f"Invalid request: {error.message}")
        raise error  # Don't retry invalid requests
    elif isinstance(error, google_exceptions.PermissionDenied):
        # Authentication issue
        print("Permission denied. Check credentials and IAM roles.")
        raise error
    elif isinstance(error, google_exceptions.NotFound):
        # Model not available in region
        print("Model not found. Check region and model availability.")
        raise error
    else:
        # Unknown error: log it and propagate to the caller
        print(f"Unknown error: {error}")
        raise error

Quota Management

The Veo 3 API has quota limits that vary by project and region. Key quotas to monitor:

  • Requests per minute: Limits how many generation requests you can submit each minute
  • Video seconds per day: Total video output per day
  • Storage: GCS storage for generated videos

Monitor your quota usage in the Google Cloud Console under APIs & Services → Quotas.

To request quota increases, submit a request through the Cloud Console. For high-volume use cases, contact Google Cloud sales for enterprise quota arrangements.

Webhooks and Notifications

For production applications, polling for completion is inefficient. Use Pub/Sub notifications instead:

from google.cloud import pubsub_v1

# Set up Pub/Sub topic for notifications
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("YOUR_PROJECT_ID", "veo3-completions")

# Configure generation with notification
operation = model.generate_video(
    prompt=prompt,
    output_gcs_uri=output_uri,
    # Pub/Sub notification when complete
    notification_config={
        "pubsub_topic": topic_path
    }
)

Your subscriber receives a message when generation completes, eliminating the need for polling.

Veo 3 API vs Seedance API

For developers evaluating AI video APIs, here's a comparison:

| Aspect | Veo 3 API | Seedance API |
|---|---|---|
| Access | Restricted (allowlist) | Open |
| Pricing | Per-second billing | Subscription-based |
| Output quality | Highest | Excellent |
| Latency | 30-120 seconds | 15-45 seconds |
| Max duration | 8 seconds | 8 seconds |
| Commercial rights | Yes (with license) | Yes (free tier) |
| Documentation | Comprehensive | Good |
| SDK support | Python, Java, Go, Node.js | REST API |

For applications where Veo 3's quality is essential and access is available, the Veo 3 API is the premium choice. For applications that need open access, faster iteration, and lower cost, Seedance's API is a strong alternative.

Explore Seedance as an accessible alternative at seedance.tv →

Frequently Asked Questions

Is the Veo 3 API publicly available? The Veo 3 API is available through Google Cloud Vertex AI, but access may require allowlist approval depending on your region and use case. Apply through the Google Cloud Console.

What programming languages are supported? Google Cloud provides SDKs for Python, Java, Go, Node.js, and .NET. The REST API works with any language that can make HTTP requests.

How long does video generation take? Typically 30-120 seconds depending on server load, video length, and resolution. Implement appropriate timeouts in your application.

Can I generate videos longer than 8 seconds? The current maximum is 8 seconds per generation. For longer videos, generate multiple clips and stitch them together in post-processing.

What happens if generation fails? Failed generations are not charged. Implement retry logic with exponential backoff for transient failures.

Is there a free tier for the Veo 3 API? Google Cloud offers $300 in free credits for new accounts, which can be used for Veo 3 API calls. There is no ongoing free tier — all usage is billed after credits are exhausted.

What are the content policy restrictions? Veo 3 follows Google's content policies. Prompts that violate these policies will be rejected. Review the Vertex AI content policy documentation for details.

Can I use generated videos commercially? Yes, with appropriate licensing. Review the Google Cloud terms of service and Vertex AI usage policies for commercial use requirements.
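For the multi-clip approach mentioned in the answer about videos longer than 8 seconds, the first step is deciding how many clips to request and how long each should be. A minimal sketch, assuming whole-second durations and a hypothetical `plan_clips` helper:

```python
from typing import List

def plan_clips(total_seconds: int, max_clip_seconds: int = 8) -> List[int]:
    """Split a target length into clip durations no longer than the maximum."""
    if total_seconds <= 0 or max_clip_seconds <= 0:
        raise ValueError("durations must be positive")
    full_clips, remainder = divmod(total_seconds, max_clip_seconds)
    # Full-length clips first, then one shorter clip for any leftover seconds
    return [max_clip_seconds] * full_clips + ([remainder] if remainder else [])

print(plan_clips(20))  # [8, 8, 4]
print(plan_clips(16))  # [8, 8]
```

Each duration then becomes one generation request, and the resulting files are joined in post-processing.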

The Veo 3 API is a powerful tool for developers building AI video applications. The combination of Google's model quality, comprehensive SDK support, and enterprise-grade infrastructure makes it a strong choice for production applications where quality is the top priority.

For developers who need open access without allowlist requirements, Seedance provides a capable alternative with a straightforward API and free commercial rights.

Start building with Seedance's open API at seedance.tv →

Getting Started Checklist

Before deploying a Veo 3 API integration to production, verify:

  • [ ] Google Cloud project created with billing enabled
  • [ ] Vertex AI API enabled
  • [ ] Service account created with appropriate IAM roles
  • [ ] Veo 3 model access confirmed (allowlist if required)
  • [ ] GCS bucket created for output storage
  • [ ] Error handling implemented for all API error types
  • [ ] Retry logic with exponential backoff implemented
  • [ ] Quota monitoring configured
  • [ ] Cost alerts set up in Cloud Billing
  • [ ] Content policy review completed
  • [ ] Commercial licensing requirements reviewed

With these items checked, your Veo 3 API integration is ready for production use.

Start with the Python SDK examples in this guide, test with a small number of generations, then scale up as you validate your integration.

Node.js Integration Example

For teams using Node.js, here's a complete integration example:

const { VertexAI } = require('@google-cloud/vertexai');
const { Storage } = require('@google-cloud/storage');

const vertexAI = new VertexAI({
  project: process.env.GOOGLE_CLOUD_PROJECT,
  location: 'us-central1'
});

const storage = new Storage();

async function generateVideo(prompt, outputBucket, outputPath) {
  const model = vertexAI.preview.getGenerativeModel({
    model: 'veo-3.0-generate-preview'
  });

  const request = {
    contents: [{
      role: 'user',
      parts: [{ text: prompt }]
    }],
    generationConfig: {
      duration: 8,
      aspectRatio: '16:9',
      resolution: '1080p',
      outputGcsUri: `gs://${outputBucket}/${outputPath}`
    }
  };

  // Submit long-running operation
  const [operation] = await model.generateVideoLongRunning(request);
  
  // Poll for completion
  const [response] = await operation.promise();
  
  const videoUri = response.candidates[0].content.parts[0].fileData.fileUri;
  console.log(`Video generated: ${videoUri}`);
  return videoUri;
}

// Usage
generateVideo(
  'Aerial view of a coastal city at sunset, slow drone descent, cinematic',
  'my-video-bucket',
  'output/coastal-sunset.mp4'
).catch(console.error);

Monitoring and Observability

For production deployments, implement monitoring to track API performance and costs:

import time
import logging
from dataclasses import dataclass
from typing import Optional

@dataclass
class GenerationMetrics:
    prompt: str
    start_time: float
    end_time: Optional[float] = None
    success: bool = False
    error: Optional[str] = None
    output_uri: Optional[str] = None
    
    @property
    def duration_seconds(self):
        if self.end_time:
            return self.end_time - self.start_time
        return None

def generate_with_metrics(prompt: str, output_uri: str) -> GenerationMetrics:
    metrics = GenerationMetrics(prompt=prompt, start_time=time.time())
    
    try:
        # generate_video_with_retry is defined in the asynchronous operations section
        result = generate_video_with_retry(prompt, output_uri)
        metrics.success = True
        metrics.output_uri = result
        metrics.end_time = time.time()
        
        logging.info(f"Generation successful: {metrics.duration_seconds:.1f}s")
        
    except Exception as e:
        metrics.success = False
        metrics.error = str(e)
        metrics.end_time = time.time()
        
        logging.error(f"Generation failed after {metrics.duration_seconds:.1f}s: {e}")
    
    return metrics

Log these metrics to Cloud Monitoring or your preferred observability platform to track generation success rates, latency trends, and cost per generation over time.
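Before exporting anything, the per-request records can be rolled up into the success rate and latency figures mentioned above. The sketch below uses a simplified stand-in record, `Outcome`, rather than the full metrics class, and the sample numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Outcome:
    success: bool
    duration_seconds: float

def summarize(outcomes: List[Outcome]) -> dict:
    """Compute the success rate and the mean latency of successful generations."""
    total = len(outcomes)
    if total == 0:
        return {"total": 0, "success_rate": None, "mean_latency_seconds": None}
    successes = [o for o in outcomes if o.success]
    mean_latency = (
        sum(o.duration_seconds for o in successes) / len(successes)
        if successes else None
    )
    return {
        "total": total,
        "success_rate": len(successes) / total,
        "mean_latency_seconds": mean_latency,
    }

# Made-up sample data
sample = [Outcome(True, 40.0), Outcome(True, 60.0),
          Outcome(False, 10.0), Outcome(True, 80.0)]
print(summarize(sample))
# {'total': 4, 'success_rate': 0.75, 'mean_latency_seconds': 60.0}
```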

Regional Availability

Veo 3 is available in select Google Cloud regions. As of 2026, the primary supported regions are:

  • us-central1 (Iowa): Primary region with full feature support
  • us-east4 (Northern Virginia): Available for US-based applications
  • europe-west4 (Netherlands): Available for European applications
  • asia-northeast1 (Tokyo): Available for Asia-Pacific applications

For the most current list of supported regions, check the Vertex AI documentation. If your preferred region is not supported, you can generate in a supported region and transfer the output to your preferred storage location.

Regional selection affects latency — choose the region closest to your application servers for the best performance. For global applications, consider generating in multiple regions and routing requests to the nearest available region.

Security Best Practices

When integrating the Veo 3 API in production, follow these security best practices:

Never hardcode credentials: Use environment variables or secret management services (Google Secret Manager, HashiCorp Vault) to store API credentials.

Use service accounts with minimal permissions: Grant only the permissions required for video generation. The Vertex AI User role is sufficient for most use cases.

Rotate credentials regularly: Implement a credential rotation schedule and automate the rotation process where possible.

Validate and sanitize prompts: If your application accepts user-provided prompts, validate and sanitize them before sending to the API to prevent prompt injection attacks.

Implement rate limiting: Protect your application from abuse by implementing rate limiting on the endpoints that trigger video generation.

Audit logging: Enable Cloud Audit Logs to track all API calls for security and compliance purposes.

Following these practices ensures your Veo 3 integration is secure and compliant with enterprise security requirements.
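As one way to keep configuration out of source code, settings can be read from environment variables and checked at startup. In this sketch `GOOGLE_CLOUD_PROJECT` matches the variable used in the Node.js example, while `VEO_LOCATION`, `VEO_OUTPUT_BUCKET`, and the `load_settings` helper are hypothetical names chosen for illustration.

```python
import os
from typing import Mapping, Optional

def load_settings(environ: Optional[Mapping[str, str]] = None) -> dict:
    """Read settings from the environment and fail fast if the project is missing."""
    env = os.environ if environ is None else environ
    project = env.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise RuntimeError("GOOGLE_CLOUD_PROJECT is not set")
    return {
        "project": project,
        # Hypothetical variable names; the default region matches this guide
        "location": env.get("VEO_LOCATION", "us-central1"),
        "bucket": env.get("VEO_OUTPUT_BUCKET"),
    }

print(load_settings({"GOOGLE_CLOUD_PROJECT": "demo-project"}))
# {'project': 'demo-project', 'location': 'us-central1', 'bucket': None}
```

Accepting the environment as an argument makes the function easy to test without touching the real process environment.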
