Google Veo 3 API: How to Access and Integrate AI Video Generation in 2026
Complete guide to the Veo 3 API: authentication, Python SDK, Node.js, REST API, async operations, cost management, and production best practices.
Emma Chen · 10 min read · Apr 17, 2026

Google Veo 3 is one of the most capable AI video generation models available in 2026. For developers and businesses who want to integrate Veo 3's capabilities into their own applications, the Google Cloud Vertex AI API provides programmatic access. This guide covers getting access to the Veo 3 API, authenticating, making requests, and handling responses.
What Is the Veo 3 API?
The Veo 3 API is available through Google Cloud's Vertex AI platform. It allows developers to generate videos programmatically by sending text prompts (and optionally reference images) to Google's servers and receiving generated video files in return.
The API is designed for enterprise and developer use cases where you need to:
- Integrate AI video generation into your own application
- Automate video production at scale
- Build products and services on top of Veo 3's capabilities
- Process large batches of video generation requests
Prerequisites
Before you can use the Veo 3 API, you need:
- A Google Cloud account with billing enabled
- A Google Cloud project with the Vertex AI API enabled
- Appropriate IAM permissions (Vertex AI User role minimum)
- API credentials (service account key or Application Default Credentials)
- Veo 3 access — the model may require allowlist approval depending on your region and use case
Enabling the Vertex AI API
# Enable the Vertex AI API for your project
gcloud services enable aiplatform.googleapis.com --project=YOUR_PROJECT_ID
Setting Up Authentication
The recommended authentication method for production use is a service account:
# Create a service account
gcloud iam service-accounts create veo3-api-user \
  --display-name="Veo 3 API User" \
  --project=YOUR_PROJECT_ID
# Grant Vertex AI User role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:veo3-api-user@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
# Create and download key
gcloud iam service-accounts keys create veo3-key.json \
  --iam-account=veo3-api-user@YOUR_PROJECT_ID.iam.gserviceaccount.com
For local development, Application Default Credentials (ADC) is simpler:
gcloud auth application-default login
Making Your First API Request
Python SDK
The Google Cloud Python SDK is the recommended way to interact with the Veo 3 API:
pip install google-cloud-aiplatform
Basic text-to-video generation:
import vertexai
from vertexai.preview.vision_models import VideoGenerationModel
# Initialize Vertex AI
vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")
# Load the Veo 3 model
model = VideoGenerationModel.from_pretrained("veo-3.0-generate-preview")
# Generate a video
operation = model.generate_video(
    prompt="A golden retriever puppy playing in autumn leaves, slow motion, cinematic, warm tones",
    output_gcs_uri="gs://YOUR_BUCKET/output/",
    duration_seconds=8,
    aspect_ratio="16:9",
    resolution="1080p"
)
# Wait for completion
response = operation.result(timeout=300)
print(f"Video generated: {response.generated_videos[0].video.uri}")
REST API
For languages without a Google Cloud SDK, you can use the REST API directly:
# Get access token
ACCESS_TOKEN=$(gcloud auth print-access-token)
# Submit generation request (the REST body takes camelCase fields under "parameters")
curl -X POST \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/veo-3.0-generate-preview:predictLongRunning" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{
      "prompt": "Aerial view of a mountain lake at sunrise, mist on the water, slow drone descent"
    }],
    "parameters": {
      "storageUri": "gs://YOUR_BUCKET/output/",
      "durationSeconds": 8,
      "aspectRatio": "16:9",
      "resolution": "1080p"
    }
  }'
The response includes an operation name that you poll for completion:
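# Poll for completion (OPERATION_NAME is the "name" value returned by the request above)
curl -X POST \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/veo-3.0-generate-preview:fetchPredictOperation" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"operationName": "OPERATION_NAME"}'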
API Parameters Reference
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| prompt | string | Text description of the video to generate |
| output_gcs_uri | string | Google Cloud Storage URI for output |
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| duration_seconds | integer | 8 | Video length (1-8 seconds) |
| aspect_ratio | string | "16:9" | "16:9", "9:16", "1:1", "4:3" |
| resolution | string | "720p" | "720p" or "1080p" |
| negative_prompt | string | null | Elements to exclude from generation |
| seed | integer | random | For reproducible results |
| guidance_scale | float | 7.5 | Prompt adherence strength (1-20) |
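The ranges listed in the tables above can be checked client-side before a request is submitted. The following is a minimal sketch, not part of any Google SDK: `validate_params` is a hypothetical helper, and the allowed values are copied from the tables in this guide rather than read from the API.

```python
# Hypothetical client-side check; the limits mirror the tables in this guide.
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3"}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}


def validate_params(duration_seconds=8, aspect_ratio="16:9",
                    resolution="720p", guidance_scale=7.5):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if not 1 <= duration_seconds <= 8:
        problems.append("duration_seconds must be between 1 and 8")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {aspect_ratio}")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if not 1 <= guidance_scale <= 20:
        problems.append("guidance_scale must be between 1 and 20")
    return problems
```

Returning a list instead of raising on the first problem lets a caller report every invalid field at once.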
Image-to-Video Parameters
For image-to-video generation, add the reference image:
from vertexai.preview.vision_models import Image
# Load reference image
reference_image = Image.load_from_file("reference.jpg")
# Generate video from image
operation = model.generate_video(
    prompt="The scene comes to life, gentle breeze, cinematic",
    image=reference_image,
    output_gcs_uri="gs://YOUR_BUCKET/output/",
    duration_seconds=8
)
Handling Asynchronous Operations
Veo 3 video generation is asynchronous — you submit a request and poll for completion. Here's a robust implementation:
import time
import vertexai
from vertexai.preview.vision_models import VideoGenerationModel
from google.api_core import exceptions
def generate_video_with_retry(prompt: str, output_uri: str, max_retries: int = 3):
    vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")
    model = VideoGenerationModel.from_pretrained("veo-3.0-generate-preview")
    for attempt in range(max_retries):
        try:
            operation = model.generate_video(
                prompt=prompt,
                output_gcs_uri=output_uri,
                duration_seconds=8,
                aspect_ratio="16:9"
            )
            # Poll with exponential backoff
            wait_time = 30
            while not operation.done():
                print(f"Waiting {wait_time}s for generation...")
                time.sleep(wait_time)
                wait_time = min(wait_time * 1.5, 120)
            response = operation.result()
            return response.generated_videos[0].video.uri
        except exceptions.ResourceExhausted:
            if attempt < max_retries - 1:
                print(f"Rate limited, waiting 60s before retry {attempt + 2}...")
                time.sleep(60)
            else:
                raise
    return None
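To see how the wait time grows in the polling loop above, the schedule can be computed on its own. This sketch uses a hypothetical helper, `polling_schedule`, with the same starting delay (30 seconds), growth factor (1.5), and cap (120 seconds) as the function above.

```python
def polling_schedule(initial=30.0, factor=1.5, cap=120.0, polls=6):
    """Return the successive sleep intervals of a capped exponential backoff."""
    waits = []
    wait = initial
    for _ in range(polls):
        waits.append(wait)
        # Grow the delay, but never beyond the cap.
        wait = min(wait * factor, cap)
    return waits


print(polling_schedule())  # [30.0, 45.0, 67.5, 101.25, 120.0, 120.0]
```

With these settings the delay reaches the cap on the fifth poll, so a slow generation is checked every two minutes from then on.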
Downloading Generated Videos
Generated videos are stored in Google Cloud Storage. Download them programmatically:
from google.cloud import storage
def download_video(gcs_uri: str, local_path: str):
    client = storage.Client()
    # Parse GCS URI
    bucket_name = gcs_uri.split("/")[2]
    blob_name = "/".join(gcs_uri.split("/")[3:])
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(blob_name)
    blob.download_to_filename(local_path)
    print(f"Downloaded to {local_path}")
# Usage
download_video(
    "gs://YOUR_BUCKET/output/generated_video.mp4",
    "/local/path/video.mp4"
)
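The string splitting in `download_video` assumes the input is always a well-formed `gs://` URI. A slightly more defensive variant, sketched here with the standard library only, rejects anything else up front. `parse_gcs_uri` is a hypothetical helper name; note that object names containing `?` or `#` would need different handling, because `urlparse` treats those characters as query and fragment separators.

```python
from urllib.parse import urlparse


def parse_gcs_uri(gcs_uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    parsed = urlparse(gcs_uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Not a valid GCS URI: {gcs_uri!r}")
    # The path keeps its leading slash, which is not part of the object name.
    return parsed.netloc, parsed.path.lstrip("/")


print(parse_gcs_uri("gs://YOUR_BUCKET/output/generated_video.mp4"))
```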
Batch Processing
For generating multiple videos efficiently:
from concurrent.futures import ThreadPoolExecutor
prompts = [
    "Mountain sunrise, slow aerial drift, golden tones",
    "Ocean waves in slow motion, emerald water",
    "City street at night, neon reflections, rain",
    "Forest path in autumn, leaves falling, dappled light",
    "Product on white background, slow rotation, studio lighting"
]
def generate_single(prompt_and_index):
    index, prompt = prompt_and_index
    output_uri = f"gs://YOUR_BUCKET/batch/video_{index:03d}.mp4"
    return generate_video_with_retry(prompt, output_uri)
# Process in parallel (respect rate limits)
with ThreadPoolExecutor(max_workers=3) as executor:
    results = list(executor.map(generate_single, enumerate(prompts)))
print(f"Generated {len([r for r in results if r])} videos successfully")
Rate Limits and Quotas
The Veo 3 API has rate limits that vary by project tier:
| Tier | Requests/minute | Concurrent operations |
|---|---|---|
| Default | 5 | 3 |
| Elevated | 20 | 10 |
| Enterprise | Custom | Custom |
To request elevated quotas, submit a quota increase request through the Google Cloud Console.
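Staying under a requests-per-minute limit is easier when the client throttles itself instead of waiting for rate-limit errors. The sketch below is one possible approach, a sliding-window limiter built on the standard library. The class name and the injectable `clock` are illustrative choices, and the default of 5 calls per 60 seconds simply mirrors the default tier in the table above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls=5, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def acquire(self):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            time.sleep(delay)
        self.calls.append(self.clock())
```

Calling `limiter.acquire()` immediately before each generation request keeps a single process within the limit. Several processes sharing one quota would need a shared store instead of in-memory state.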
Cost Estimation
Veo 3 API pricing is based on video duration generated:
- Standard (720p): ~$0.35 per second of video
- High quality (1080p): ~$0.50 per second of video
For a batch of 100 eight-second clips at 1080p: approximately $400.
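The estimate above is clips × seconds per clip × rate per second. A small helper makes it easy to rerun for other batch sizes. This is a sketch only: the rates are the approximate figures quoted in this guide, and `estimate_cost` is a hypothetical function name.

```python
# Approximate per-second rates quoted in this guide (USD); verify against current pricing.
RATE_PER_SECOND = {"720p": 0.35, "1080p": 0.50}


def estimate_cost(clips, seconds_per_clip, resolution="1080p"):
    """Estimated cost in USD for a batch of clips, rounded to cents."""
    return round(clips * seconds_per_clip * RATE_PER_SECOND[resolution], 2)


print(estimate_cost(100, 8, "1080p"))  # 400.0
print(estimate_cost(100, 8, "720p"))   # 280.0
```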
For cost-sensitive applications, consider using Seedance's API as an alternative — it offers comparable quality at lower cost with commercial rights included.
Error Handling
Common errors and how to handle them:
from google.api_core import exceptions
try:
    operation = model.generate_video(prompt=prompt, output_gcs_uri=output_uri)
    response = operation.result(timeout=300)
except exceptions.ResourceExhausted as e:
    # Rate limit hit: implement exponential backoff
    print(f"Rate limited: {e}")
except exceptions.InvalidArgument as e:
    # Bad prompt or parameters
    print(f"Invalid request: {e}")
except exceptions.PermissionDenied as e:
    # Authentication or authorization issue
    print(f"Permission denied: {e}")
except TimeoutError:
    # Generation took too long
    print("Generation timed out. Check operation status manually")
Production Best Practices
Prompt Engineering for API Use
When using the API programmatically, prompt quality directly affects output quality:
def build_prompt(subject: str, action: str, environment: str,
                 camera: str = "cinematic", style: str = "photorealistic") -> str:
    return f"{subject} {action}, {environment}, {camera}, {style}"
# Examples
prompts = [
    build_prompt("luxury watch", "slow 360-degree rotation", "black marble surface, soft studio lighting", "close-up, shallow depth of field", "commercial photography quality"),
    build_prompt("young professional", "walking confidently", "modern glass office building lobby", "slow dolly forward", "corporate, natural lighting"),
]
Output Management
from datetime import datetime
def get_output_uri(project_id: str, bucket: str, job_name: str) -> str:
    # One timestamped folder per job keeps outputs from different runs separate
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"gs://{bucket}/veo3/{job_name}/{timestamp}/"
Monitoring and Logging
import logging
import time
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("veo3_api")
def generate_with_logging(prompt: str, output_uri: str):
    logger.info(f"Starting generation: {prompt[:50]}...")
    start_time = time.time()
    result = generate_video_with_retry(prompt, output_uri)
    elapsed = time.time() - start_time
    logger.info(f"Generation complete in {elapsed:.1f}s: {result}")
    return result
Alternative: Seedance API
For applications where cost, access restrictions, or commercial rights are concerns, Seedance offers an API with:
- Lower cost per generation
- No access restrictions or allowlist requirements
- Commercial rights included
- Comparable output quality for most use cases
The Seedance API follows a similar request/response pattern, making it straightforward to switch between providers or use both in the same application.
Learn more about Seedance at seedance.tv →
Cost Management
Veo 3 API pricing is based on video seconds generated. Understanding the cost structure helps you build cost-effective applications.
Pricing Structure
- Standard resolution (720p): Charged per second of video generated
- High resolution (1080p): Higher per-second rate
- Failed generations: Not charged if the generation fails before producing output
Cost Optimization Strategies
1. Use lower resolution for testing
During development and testing, use 720p to reduce costs. Switch to 1080p only for production output.
# Development/testing
resolution = "720p"
# Production
resolution = "1080p"
2. Implement prompt validation before generation
Validate prompts before sending to the API to avoid wasting credits on prompts that will fail content policy checks:
def validate_prompt(prompt: str) -> bool:
    # Check minimum length
    if len(prompt.strip()) < 10:
        return False
    # Check maximum length
    if len(prompt) > 1000:
        return False
    # Add your own validation logic
    return True
3. Cache generated videos
Store generated videos in GCS and implement a caching layer to avoid regenerating the same content:
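import hashlib
from google.cloud import storage
def get_cache_key(prompt: str, params: dict) -> str:
    content = f"{prompt}:{sorted(params.items())}"
    return hashlib.md5(content.encode()).hexdigest()
def gcs_file_exists(gcs_uri: str) -> bool:
    # Same URI parsing as download_video
    bucket_name = gcs_uri.split("/")[2]
    blob_name = "/".join(gcs_uri.split("/")[3:])
    return storage.Client().bucket(bucket_name).blob(blob_name).exists()
def get_or_generate(prompt: str, params: dict) -> str:
    cache_key = get_cache_key(prompt, params)
    cached_uri = f"gs://YOUR_BUCKET/cache/{cache_key}.mp4"
    # Check if cached version exists
    if gcs_file_exists(cached_uri):
        return cached_uri
    # Generate and cache; generate_video is your own wrapper around the
    # generation call that also accepts the extra parameters
    result = generate_video(prompt, cached_uri, params)
    return result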
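The cache key above is built from the string form of a sorted item list, which works for flat parameter dictionaries. If parameters may be nested, serializing to canonical JSON first is one way to keep the key stable. The following sketch is an alternative rather than the approach used above: `stable_cache_key` is a hypothetical name, and SHA-256 is swapped in for MD5.

```python
import hashlib
import json


def stable_cache_key(prompt, params):
    """Hash the prompt and parameters into a key that ignores dict ordering."""
    payload = json.dumps({"prompt": prompt, "params": params},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = stable_cache_key("Mountain sunrise", {"resolution": "720p", "duration_seconds": 8})
key_b = stable_cache_key("Mountain sunrise", {"duration_seconds": 8, "resolution": "720p"})
print(key_a == key_b)  # True
```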
4. Use batch processing for large volumes
For large-scale generation, batch requests to maximize throughput while staying within quota limits:
import asyncio
async def batch_generate(prompts: list, max_concurrent: int = 3):
    # Keep max_concurrent at or below your concurrent-operations quota
    semaphore = asyncio.Semaphore(max_concurrent)
    async def generate_one(prompt):
        async with semaphore:
            # generate_video is your own blocking wrapper around the generation call
            return await asyncio.to_thread(generate_video, prompt)
    tasks = [generate_one(p) for p in prompts]
    return await asyncio.gather(*tasks, return_exceptions=True)
Error Handling
Robust error handling is essential for production Veo 3 API integrations:
import time
from google.api_core import exceptions as google_exceptions
def handle_veo3_error(error):
    if isinstance(error, google_exceptions.ResourceExhausted):
        # Quota exceeded: pause, then let the caller retry
        print("Quota exceeded. Waiting before retry...")
        time.sleep(60)
    elif isinstance(error, google_exceptions.InvalidArgument):
        # Bad request: check prompt and parameters
        print(f"Invalid request: {error.message}")
        raise error  # Don't retry invalid requests
    elif isinstance(error, google_exceptions.PermissionDenied):
        # Authentication issue
        print("Permission denied. Check credentials and IAM roles.")
        raise error
    elif isinstance(error, google_exceptions.NotFound):
        # Model not available in region
        print("Model not found. Check region and model availability.")
        raise error
    else:
        # Unknown error: log and re-raise
        print(f"Unknown error: {error}")
        raise error
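The handler above pauses for a fixed 60 seconds on quota errors. A common refinement is exponential backoff with jitter, sketched below with the standard library only. `RetryableError` is a stand-in exception defined here for illustration; in a real integration you would catch `google.api_core.exceptions.ResourceExhausted` in its place. The `sleep` parameter is injectable so the delays can be tested without waiting.

```python
import random
import time


class RetryableError(Exception):
    """Stand-in for transient failures such as quota exhaustion."""


def call_with_backoff(func, max_attempts=4, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call func, retrying RetryableError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error.
            delay = min(base_delay * 2 ** attempt, max_delay)
            # Full jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, delay))
```

Any exception other than `RetryableError` propagates immediately, which matches the guidance above not to retry invalid requests or permission failures.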
Quota Management
The Veo 3 API has quota limits that vary by project and region. Key quotas to monitor:
- Requests per minute: Limits how many generation requests you can submit each minute
- Video seconds per day: Total video output per day
- Storage: GCS storage for generated videos
Monitor your quota usage in the Google Cloud Console under APIs & Services → Quotas.
To request quota increases, submit a request through the Cloud Console. For high-volume use cases, contact Google Cloud sales for enterprise quota arrangements.
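A daily video-seconds quota can also be tracked locally so that a batch job stops before it hits the limit. The sketch below is an in-process approximation only: the class name is hypothetical, the limit is whatever your project has been granted, and the actual reset time of a Google Cloud quota may not align with the local calendar day used here.

```python
import datetime


class DailySecondsBudget:
    """Track video seconds requested per calendar day against a fixed limit."""

    def __init__(self, daily_limit_seconds, today=datetime.date.today):
        self.daily_limit = daily_limit_seconds
        self.today = today  # injectable so the reset logic can be tested
        self.day = today()
        self.used = 0

    def try_reserve(self, seconds):
        """Reserve seconds if the budget allows it; return True on success."""
        current = self.today()
        if current != self.day:
            # A new day has started, so the counter resets.
            self.day = current
            self.used = 0
        if self.used + seconds > self.daily_limit:
            return False
        self.used += seconds
        return True
```

Calling `try_reserve(8)` before each eight-second generation gives the job a clear point at which to pause or queue the remaining prompts for the next day.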
Webhooks and Notifications
For production applications, polling for completion is inefficient. Use Pub/Sub notifications instead:
from google.cloud import pubsub_v1
# Set up Pub/Sub topic for notifications
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("YOUR_PROJECT_ID", "veo3-completions")
# Configure generation with notification
operation = model.generate_video(
    prompt=prompt,
    output_gcs_uri=output_uri,
    # Pub/Sub notification when complete
    notification_config={
        "pubsub_topic": topic_path
    }
)
Your subscriber receives a message when generation completes, eliminating the need for polling.
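If the subscription uses push delivery, Pub/Sub posts a JSON envelope to your endpoint with the message body base64-encoded in `message.data`. The sketch below decodes that envelope with the standard library. The envelope layout is standard Pub/Sub push format, but the payload fields shown in the usage example (`video_uri`, `status`) are assumptions for illustration, not a documented Veo 3 schema.

```python
import base64
import json


def decode_push_message(envelope):
    """Extract (message_id, payload) from a Pub/Sub push request body."""
    message = envelope["message"]
    raw = base64.b64decode(message.get("data", ""))
    # An empty body is treated as an empty payload instead of an error.
    payload = json.loads(raw) if raw else {}
    return message.get("messageId"), payload


# Usage with a hand-built envelope (the payload fields are hypothetical).
body = {"video_uri": "gs://YOUR_BUCKET/output/video.mp4", "status": "done"}
envelope = {
    "message": {
        "messageId": "1",
        "data": base64.b64encode(json.dumps(body).encode()).decode(),
    },
    "subscription": "projects/YOUR_PROJECT_ID/subscriptions/veo3-completions-sub",
}
print(decode_push_message(envelope))
```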
Veo 3 API vs Seedance API
For developers evaluating AI video APIs, here's a comparison:
| Aspect | Veo 3 API | Seedance API |
|---|---|---|
| Access | Restricted (allowlist) | Open |
| Pricing | Per-second billing | Subscription-based |
| Output quality | Highest | Excellent |
| Latency | 30-120 seconds | 15-45 seconds |
| Max duration | 8 seconds | 8 seconds |
| Commercial rights | Yes (with license) | Yes (free tier) |
| Documentation | Comprehensive | Good |
| SDK support | Python, Java, Go, Node.js | REST API |
For applications where Veo 3's quality is essential and access is available, the Veo 3 API is the premium choice. For applications that need open access, faster iteration, and lower cost, Seedance's API is a strong alternative.
Explore Seedance as an accessible alternative at seedance.tv →
Frequently Asked Questions
Is the Veo 3 API publicly available? The Veo 3 API is available through Google Cloud Vertex AI, but access may require allowlist approval depending on your region and use case. Apply through the Google Cloud Console.
What programming languages are supported? Google Cloud provides SDKs for Python, Java, Go, Node.js, and .NET. The REST API works with any language that can make HTTP requests.
How long does video generation take? Typically 30-120 seconds depending on server load, video length, and resolution. Implement appropriate timeouts in your application.
Can I generate videos longer than 8 seconds? The current maximum is 8 seconds per generation. For longer videos, generate multiple clips and stitch them together in post-processing.
What happens if generation fails? Failed generations are not charged. Implement retry logic with exponential backoff for transient failures.
Is there a free tier for the Veo 3 API? Google Cloud offers $300 in free credits for new accounts, which can be used for Veo 3 API calls. There is no ongoing free tier — all usage is billed after credits are exhausted.
What are the content policy restrictions? Veo 3 follows Google's content policies. Prompts that violate these policies will be rejected. Review the Vertex AI content policy documentation for details.
Can I use generated videos commercially? Yes, with appropriate licensing. Review the Google Cloud terms of service and Vertex AI usage policies for commercial use requirements.
The Veo 3 API is a powerful tool for developers building AI video applications. The combination of Google's model quality, comprehensive SDK support, and enterprise-grade infrastructure makes it a strong choice for production applications where quality is the top priority.
For developers who need open access without allowlist requirements, Seedance provides a capable alternative with a straightforward API and free commercial rights.
Start building with Seedance's open API at seedance.tv →
Getting Started Checklist
Before deploying a Veo 3 API integration to production, verify:
- [ ] Google Cloud project created with billing enabled
- [ ] Vertex AI API enabled
- [ ] Service account created with appropriate IAM roles
- [ ] Veo 3 model access confirmed (allowlist if required)
- [ ] GCS bucket created for output storage
- [ ] Error handling implemented for all API error types
- [ ] Retry logic with exponential backoff implemented
- [ ] Quota monitoring configured
- [ ] Cost alerts set up in Cloud Billing
- [ ] Content policy review completed
- [ ] Commercial licensing requirements reviewed
With these items checked, your Veo 3 API integration is ready for production use.
Start with the Python SDK examples in this guide, test with a small number of generations, then scale up as you validate your integration.
Node.js Integration Example
For teams using Node.js, here's a complete integration example:
const { VertexAI } = require('@google-cloud/vertexai');
const { Storage } = require('@google-cloud/storage');
const vertexAI = new VertexAI({
  project: process.env.GOOGLE_CLOUD_PROJECT,
  location: 'us-central1'
});
const storage = new Storage();
async function generateVideo(prompt, outputBucket, outputPath) {
  const model = vertexAI.preview.getGenerativeModel({
    model: 'veo-3.0-generate-preview'
  });
  const request = {
    contents: [{
      role: 'user',
      parts: [{ text: prompt }]
    }],
    generationConfig: {
      duration: 8,
      aspectRatio: '16:9',
      resolution: '1080p',
      outputGcsUri: `gs://${outputBucket}/${outputPath}`
    }
  };
  // Submit long-running operation
  const [operation] = await model.generateVideoLongRunning(request);
  // Poll for completion
  const [response] = await operation.promise();
  const videoUri = response.candidates[0].content.parts[0].fileData.fileUri;
  console.log(`Video generated: ${videoUri}`);
  return videoUri;
}
// Usage
generateVideo(
  'Aerial view of a coastal city at sunset, slow drone descent, cinematic',
  'my-video-bucket',
  'output/coastal-sunset.mp4'
).catch(console.error);
Monitoring and Observability
For production deployments, implement monitoring to track API performance and costs:
import time
import logging
from dataclasses import dataclass
from typing import Optional
@dataclass
class GenerationMetrics:
    prompt: str
    start_time: float
    end_time: Optional[float] = None
    success: bool = False
    error: Optional[str] = None
    output_uri: Optional[str] = None
    @property
    def duration_seconds(self):
        if self.end_time:
            return self.end_time - self.start_time
        return None
def generate_with_metrics(prompt: str, output_uri: str) -> GenerationMetrics:
    metrics = GenerationMetrics(prompt=prompt, start_time=time.time())
    try:
        # generate_video_with_retry is defined under "Handling Asynchronous Operations"
        result = generate_video_with_retry(prompt, output_uri)
        metrics.success = True
        metrics.output_uri = result
        metrics.end_time = time.time()
        logging.info(f"Generation successful: {metrics.duration_seconds:.1f}s")
    except Exception as e:
        metrics.success = False
        metrics.error = str(e)
        metrics.end_time = time.time()
        logging.error(f"Generation failed after {metrics.duration_seconds:.1f}s: {e}")
    return metrics
Log these metrics to Cloud Monitoring or your preferred observability platform to track generation success rates, latency trends, and cost per generation over time.
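Before the numbers reach a dashboard, they usually need to be aggregated. The sketch below computes a success rate and mean latency from any objects that expose `success` and `duration_seconds`, as `GenerationMetrics` does. `summarize` is a hypothetical helper, and the usage example relies on `SimpleNamespace` stand-ins with made-up timings.

```python
from statistics import mean
from types import SimpleNamespace


def summarize(metrics):
    """Aggregate success rate and mean latency over a list of metric records."""
    total = len(metrics)
    if total == 0:
        return {"total": 0, "success_rate": 0.0, "mean_latency_s": None}
    # Records that never finished have no duration and are left out of the mean.
    durations = [m.duration_seconds for m in metrics if m.duration_seconds is not None]
    return {
        "total": total,
        "success_rate": sum(1 for m in metrics if m.success) / total,
        "mean_latency_s": mean(durations) if durations else None,
    }


# Usage with illustrative records.
runs = [
    SimpleNamespace(success=True, duration_seconds=40.0),
    SimpleNamespace(success=False, duration_seconds=60.0),
]
print(summarize(runs))  # {'total': 2, 'success_rate': 0.5, 'mean_latency_s': 50.0}
```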
Regional Availability
Veo 3 is available in select Google Cloud regions. As of 2026, the primary supported regions are:
- us-central1 (Iowa): Primary region with full feature support
- us-east4 (Northern Virginia): Available for US-based applications
- europe-west4 (Netherlands): Available for European applications
- asia-northeast1 (Tokyo): Available for Asia-Pacific applications
For the most current list of supported regions, check the Vertex AI documentation. If your preferred region is not supported, you can generate in a supported region and transfer the output to your preferred storage location.
Regional selection affects latency — choose the region closest to your application servers for the best performance. For global applications, consider generating in multiple regions and routing requests to the nearest available region.
Security Best Practices
When integrating the Veo 3 API in production, follow these security best practices:
Never hardcode credentials: Use environment variables or secret management services (Google Secret Manager, HashiCorp Vault) to store API credentials.
Use service accounts with minimal permissions: Grant only the permissions required for video generation. The Vertex AI User role is sufficient for most use cases.
Rotate credentials regularly: Implement a credential rotation schedule and automate the rotation process where possible.
Validate and sanitize prompts: If your application accepts user-provided prompts, validate and sanitize them before sending to the API to prevent prompt injection attacks.
Implement rate limiting: Protect your application from abuse by implementing rate limiting on the endpoints that trigger video generation.
Audit logging: Enable Cloud Audit Logs to track all API calls for security and compliance purposes.
Following these practices ensures your Veo 3 integration is secure and compliant with enterprise security requirements.
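As a concrete instance of the first practice, settings can be read from environment variables at startup, failing fast when one is missing. This is a minimal sketch: `GOOGLE_CLOUD_PROJECT` is the variable already used in the Node.js example, while `VEO_OUTPUT_BUCKET` and `VEO_LOCATION` are hypothetical names chosen for illustration.

```python
import os


def load_settings(env=None):
    """Read deployment settings from environment variables, failing fast if any are missing."""
    env = os.environ if env is None else env
    required = ["GOOGLE_CLOUD_PROJECT", "VEO_OUTPUT_BUCKET"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "project": env["GOOGLE_CLOUD_PROJECT"],
        "bucket": env["VEO_OUTPUT_BUCKET"],
        # Optional; falls back to the region used throughout this guide.
        "location": env.get("VEO_LOCATION", "us-central1"),
    }
```

Passing a plain dictionary as `env` makes the function easy to test without touching the real environment.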