Google VertexAI — Setup & Integration
Enterprise VertexAI setup with India region and pricing
Google VertexAI is one of the three major enterprise AI platforms, alongside AWS Bedrock and Azure AI. For Indian enterprises, it offers a useful combination: Gemini models (among the more cost-effective frontier LLMs), Model Garden with 200+ open-source and third-party models, availability in India regions, and tight integration with BigQuery for data-heavy AI workloads.
This guide walks through enterprise-grade setup from GCP project creation to production-ready AI integration.
What You'll Learn
- GCP project setup and VertexAI API enablement
- Authentication: service accounts and workload identity
- Available models: Gemini, open-source via Model Garden
- Python SDK quickstart with working code examples
- Grounding with Google Search and private data
- India region (asia-south1) configuration for data residency
- Pricing breakdown in ₹
- Enterprise security: VPC-SC, CMEK, audit logging
- Integration patterns: REST API, LangChain, LlamaIndex
GCP Project Setup
Step 1: Create a GCP Project
# Install gcloud CLI (if not already installed)
curl https://sdk.cloud.google.com | bash
# Authenticate
gcloud auth login
# Create project
gcloud projects create my-ai-project-2026 --name="Enterprise AI"
# Set as active project
gcloud config set project my-ai-project-2026
# Link billing account (required for VertexAI)
gcloud billing accounts list
gcloud billing projects link my-ai-project-2026 --billing-account=BILLING_ACCOUNT_ID
Step 2: Enable VertexAI API
# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com
# Enable additional APIs for full functionality
gcloud services enable compute.googleapis.com
gcloud services enable storage.googleapis.com
gcloud services enable bigquery.googleapis.com
Step 3: Set India Region
# Set default region to Mumbai
gcloud config set compute/region asia-south1
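# List models registered in your project in asia-south1
# (publisher models such as Gemini are not listed here; check the regional availability documentation)
gcloud ai models list --region=asia-south1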
India Data Residency: Use asia-south1 (Mumbai) or asia-south2 (Delhi) to keep data in India and support DPDP Act compliance. Some newer models may launch in us-central1 first, so check regional availability before production deployment.
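One way to enforce the residency rule in application code is a small guard that runs before the SDK is initialized. This is a minimal sketch under that assumption; `INDIA_REGIONS` and `require_india_region` are hypothetical names, not part of any Google SDK.

```python
# Hypothetical guard: not part of the Vertex AI SDK.
INDIA_REGIONS = frozenset({"asia-south1", "asia-south2"})  # Mumbai, Delhi


def require_india_region(location: str) -> str:
    """Return the normalized location if it is an India region, else raise."""
    normalized = location.strip().lower()
    if normalized not in INDIA_REGIONS:
        raise ValueError(
            f"{location!r} is outside India; expected one of {sorted(INDIA_REGIONS)}"
        )
    return normalized


# Validate the configured region once, at startup.
location = require_india_region("asia-south1")
print(location)
```

Failing fast at startup is a design choice: a misconfigured region then stops the deployment instead of silently sending prompts to another geography.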
Authentication
Service Accounts (Recommended for Production)
# Create a service account
gcloud iam service-accounts create vertexai-production \
    --display-name="VertexAI Production Service Account"
# Grant minimal required roles
gcloud projects add-iam-policy-binding my-ai-project-2026 \
    --member="serviceAccount:vertexai-production@my-ai-project-2026.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"
# Generate key file (for non-GCP environments)
gcloud iam service-accounts keys create key.json \
    --iam-account=vertexai-production@my-ai-project-2026.iam.gserviceaccount.com
IAM Roles for VertexAI:
| Role | Purpose | Use Case |
|------|---------|----------|
| roles/aiplatform.user | Invoke models, run predictions | Application service accounts |
| roles/aiplatform.admin | Full VertexAI management | CoE administrators |
| roles/aiplatform.viewer | Read-only access | Monitoring and auditing |
| roles/aiplatform.modelUser | Custom model access | Fine-tuned model deployment |
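When a key file such as key.json is used outside GCP, it can help to confirm which identity it belongs to without ever printing the secret. The sketch below assumes the standard service account key layout (`type`, `client_email`, `project_id`, `private_key`) and the `GOOGLE_APPLICATION_CREDENTIALS` environment variable that Google client libraries read; `describe_key_file` is a hypothetical helper.

```python
import json
import os
from pathlib import Path


def describe_key_file(path: str) -> dict:
    """Return only the non-secret fields of a service account key file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if data.get("type") != "service_account":
        raise ValueError(f"{path} is not a service account key file")
    # Deliberately omit private_key so it never reaches logs.
    return {"client_email": data["client_email"], "project_id": data["project_id"]}


# Google client libraries read the key path from this environment variable.
key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
if key_path and Path(key_path).is_file():
    print(describe_key_file(key_path))
```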
Workload Identity (GKE)
For applications running on Google Kubernetes Engine, use Workload Identity instead of service account keys:
# Enable Workload Identity on GKE cluster
gcloud container clusters update my-cluster \
    --region=asia-south1 \
    --workload-pool=my-ai-project-2026.svc.id.goog
This eliminates the need to manage and rotate service account key files.
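Enabling the workload pool is only the first half of the setup. A Kubernetes service account still has to be allowed to act as the Google service account, typically by granting `roles/iam.workloadIdentityUser` to a member of the form `serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]`. The sketch below only builds that member string; the namespace `ai-apps` and Kubernetes service account `vertexai-ksa` are made-up examples, and `workload_identity_member` is a hypothetical helper.

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Build the IAM member string that represents a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


# Example values for illustration only.
member = workload_identity_member("my-ai-project-2026", "ai-apps", "vertexai-ksa")
print(member)
```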
Model Access
Gemini Models
| Model | Best For | Context Window | India Region |
|-------|---------|---------------|:------------:|
| Gemini 2.5 Pro | Complex reasoning, code, analysis | 1M tokens | Yes |
| Gemini 2.5 Flash | High-volume, cost-sensitive tasks | 1M tokens | Yes |
| Gemini 2.0 Flash | Fast inference, simple tasks | 1M tokens | Yes |
Model Garden (200+ Models)
VertexAI Model Garden provides access to open-source and third-party models:
- Meta Llama 3.1/4: Open-source, strong multilingual
- Mistral/Mixtral: Efficient, code-capable
- Claude (Anthropic): Available through VertexAI partnership
- Stable Diffusion: Image generation
- Whisper: Speech-to-text
- Specialized models: Medical, legal, financial domain models
Python SDK Quickstart
Installation
pip install google-cloud-aiplatform
Basic Text Generation
import vertexai
from vertexai.generative_models import GenerativeModel
# Initialize with India region
vertexai.init(project="my-ai-project-2026", location="asia-south1")
# Load Gemini model
model = GenerativeModel("gemini-2.5-pro")
# Generate response
response = model.generate_content(
    "Explain the key provisions of India's DPDP Act 2023 for AI systems"
)
print(response.text)
Streaming Response
# Stream for real-time applications (chatbots, live analysis)
response = model.generate_content(
    "Write a compliance checklist for healthcare AI in India",
    stream=True
)
for chunk in response:
    print(chunk.text, end="", flush=True)
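Streaming applications often need the complete text as well, for example to store the answer after displaying it. The following is a minimal sketch, assuming each chunk exposes a `.text` attribute as in the loop above; `collect_stream` is a hypothetical helper, and the stand-in chunks replace a real API call so the example runs offline.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks so the sketch runs without calling the API.
fake_stream = [SimpleNamespace(text="Step 1. "), SimpleNamespace(text="Step 2.")]
full_text = collect_stream(fake_stream)
```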
Multi-Turn Conversation
chat = model.start_chat()
response1 = chat.send_message("What is SOC2 compliance?")
print(response1.text)
response2 = chat.send_message("How does it apply to AI systems specifically?")
print(response2.text)
# Chat maintains context across turns
response3 = chat.send_message("Give me a checklist for my SaaS product")
print(response3.text)
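Conceptually, the chat session keeps earlier turns and sends them along with each new message, which is why later questions can refer back to earlier ones. The toy class below illustrates that idea using the role/parts shape of the REST request body; `ChatHistory` is a hypothetical class, not the SDK's own chat session, and the model reply is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class ChatHistory:
    """Toy conversation log in the role/parts shape used by the REST API."""

    contents: list = field(default_factory=list)

    def add(self, role: str, text: str) -> None:
        if role not in ("user", "model"):
            raise ValueError("role must be 'user' or 'model'")
        self.contents.append({"role": role, "parts": [{"text": text}]})


history = ChatHistory()
history.add("user", "What is SOC2 compliance?")
history.add("model", "(placeholder for the first answer)")
history.add("user", "How does it apply to AI systems specifically?")
# Every turn recorded so far would accompany the next request.
print(len(history.contents))
```

Because the whole history travels with each request, long conversations consume more input tokens per turn, which matters for the pricing discussed later.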
Structured Output (JSON)
import json
from vertexai.generative_models import GenerationConfig
response = model.generate_content(
    "Analyze these 3 Indian AI startups: Krutrim, Sarvam AI, Ola Maps AI. "
    "Return a JSON array with name, focus_area, funding_status, and strength.",
    generation_config=GenerationConfig(
        response_mime_type="application/json"
    )
)
# Parse the JSON text returned by the model
startups = json.loads(response.text)
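Requesting JSON does not by itself guarantee that every item carries the fields you asked for, so a validation step before downstream use is a reasonable precaution. This sketch checks the four keys named in the prompt; `parse_startups` is a hypothetical helper, and the sample string is placeholder data rather than real model output.

```python
import json

REQUIRED_KEYS = {"name", "focus_area", "funding_status", "strength"}


def parse_startups(raw: str) -> list:
    """Parse a JSON array and check that every item has the requested keys."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"item {index} is not a JSON object")
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"item {index} is missing {sorted(missing)}")
    return items


# Placeholder input standing in for response.text.
sample = (
    '[{"name": "Example Co", "focus_area": "speech", '
    '"funding_status": "unknown", "strength": "placeholder"}]'
)
print(parse_startups(sample)[0]["name"])
```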
Grounding with Google Search
Grounding connects Gemini to real-time information, reducing hallucinations and providing citations:
from vertexai.generative_models import GenerativeModel, Tool
from vertexai.preview.generative_models import grounding
model = GenerativeModel("gemini-2.5-pro")
# Ground with Google Search
response = model.generate_content(
    "What are the latest RBI guidelines on AI usage in Indian banking?",
    tools=[Tool.from_google_search_retrieval(
        grounding.GoogleSearchRetrieval()
    )]
)
print(response.text)
# Response includes citations with source URLs
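To surface those citations in a UI or log, the source URLs have to be pulled out of the response metadata. The sketch below works on a plain dictionary and assumes the REST-style layout `candidates[].groundingMetadata.groundingChunks[].web` with `uri` and `title` fields; that layout is an assumption to verify against the current API reference, and `extract_citations` and the example.com URLs are illustrative only.

```python
def extract_citations(response: dict) -> list:
    """Collect (title, uri) pairs from an assumed grounding metadata layout."""
    citations = []
    for candidate in response.get("candidates", []):
        metadata = candidate.get("groundingMetadata", {})
        for chunk in metadata.get("groundingChunks", []):
            web = chunk.get("web")
            if web and "uri" in web:
                citations.append((web.get("title", ""), web["uri"]))
    return citations


# Hand-written stand-in for a grounded response; not real API output.
sample_response = {
    "candidates": [{
        "groundingMetadata": {
            "groundingChunks": [
                {"web": {"uri": "https://example.com/a", "title": "Example source A"}},
                {"web": {"uri": "https://example.com/b"}},
            ]
        }
    }]
}
citations = extract_citations(sample_response)
for title, uri in citations:
    print(title or "(untitled)", uri)
```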
Grounding with Private Data
Connect Gemini to your enterprise data stored in Cloud Storage or BigQuery:
from vertexai.preview import rag
# Create a RAG corpus from Cloud Storage
corpus = rag.create_corpus(
    display_name="Company Policies",
    description="Internal compliance and AI usage policies"
)
# Import documents
rag.import_files(
    corpus_name=corpus.name,
    paths=["gs://my-bucket/policies/"],
    chunk_size=512,
    chunk_overlap=50
)
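# Retrieve the most relevant chunks from the corpus
# (retrieval_query returns retrieved contexts, not a generated answer, and takes no model argument)
response = rag.retrieval_query(
    rag_resources=[rag.RagResource(rag_corpus=corpus.name)],
    text="What is our company policy on sending customer data to external AI APIs?",
    similarity_top_k=5
)
# Pass the retrieved contexts to gemini-2.5-pro in the prompt to generate a grounded answer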
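The `chunk_size` and `chunk_overlap` settings control how documents are split before indexing: each chunk holds up to 512 units, and consecutive chunks share 50 of them so that sentences cut at a boundary still appear whole in one chunk. The sketch below illustrates the sliding-window idea using words as a stand-in for tokens; it is not how the managed service tokenizes or splits files, and `chunk_words` is a hypothetical helper.

```python
def chunk_words(words: list, chunk_size: int, chunk_overlap: int) -> list:
    """Split a word list into overlapping windows of at most chunk_size items."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # the last window already reached the end
    return chunks


# 1,000 placeholder words split with the settings used above.
words = [f"w{i}" for i in range(1000)]
chunks = chunk_words(words, chunk_size=512, chunk_overlap=50)
print([len(chunk) for chunk in chunks])  # [512, 512, 76]
```

Larger overlaps improve recall at chunk boundaries but increase the number of chunks stored and searched.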
India Pricing in ₹
Prices as of March 2026, converted at approximately ₹84/USD.
| Model | Input (₹/1M tokens) | Output (₹/1M tokens) | Context Window |
|-------|--------------------:|---------------------:|---------------|
| Gemini 2.5 Pro | ₹105 | ₹420 | 1M tokens |
| Gemini 2.5 Flash | ₹6.3 | ₹25.2 | 1M tokens |
| Gemini 2.0 Flash | ₹5.0 | ₹21.0 | 1M tokens |
Monthly Cost Estimates:
| Use Case | Volume | Model | Monthly Cost |
|----------|--------|-------|------------:|
| Customer support chatbot | 5M tokens/day | Flash | ~₹4,700/month |
| Document analysis | 20M tokens/day | Pro | ~₹1,57,500/month |
| Code review assistant | 10M tokens/day | Pro | ~₹78,750/month |
| Bulk data classification | 50M tokens/day | Flash | ~₹47,250/month |
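Monthly estimates like these depend heavily on how the daily volume splits between input and output tokens, which the table does not state. The sketch below recomputes the figures from the per-million rates listed above, assuming a 30-day month and a 50/50 input/output split; both assumptions, and the name `monthly_cost_inr`, are mine rather than the source's.

```python
# Rates in ₹ per 1M tokens, copied from the pricing table above.
RATES_INR_PER_MILLION = {
    "pro": {"input": 105.0, "output": 420.0},
    "flash": {"input": 6.3, "output": 25.2},
}


def monthly_cost_inr(model: str, tokens_per_day: int,
                     input_share: float = 0.5, days: int = 30) -> float:
    """Estimate monthly spend for a given daily token volume."""
    rates = RATES_INR_PER_MILLION[model]
    monthly_millions = tokens_per_day * days / 1_000_000
    input_cost = monthly_millions * input_share * rates["input"]
    output_cost = monthly_millions * (1 - input_share) * rates["output"]
    return input_cost + output_cost


print(round(monthly_cost_inr("pro", 20_000_000)))  # 157500 (document analysis)
print(round(monthly_cost_inr("pro", 10_000_000)))  # 78750 (code review)
```

Under these assumptions the two Pro rows reproduce exactly. The Flash rows do not: a 50/50 split gives about ₹2,360 for 5M tokens/day, while the table's ~₹4,700 matches the case where the daily volume is counted once as input and once as output. Treat the table as an order-of-magnitude guide and rerun the numbers with your own input/output ratio.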
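Cost Tip: Use Gemini Flash for high-volume, straightforward tasks and reserve Pro for complex reasoning. At the rates above, Flash costs about 6% of Pro per token, so an enterprise that routes most of its traffic to Flash can save roughly 60-80%; the exact figure depends on the share of requests Flash can handle.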
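The size of the saving follows directly from the price ratio between the two models and the share of traffic sent to Flash. The sketch below computes the saving relative to an all-Pro baseline using the input rates from the pricing table; it assumes a request consumes the same number of tokens on either model, and `blended_savings` is a hypothetical helper.

```python
def blended_savings(flash_share: float, pro_rate: float = 105.0,
                    flash_rate: float = 6.3) -> float:
    """Fraction of spend saved versus sending every token to Pro."""
    if not 0.0 <= flash_share <= 1.0:
        raise ValueError("flash_share must be between 0 and 1")
    blended = flash_share * flash_rate + (1 - flash_share) * pro_rate
    return 1 - blended / pro_rate


for share in (0.5, 0.65, 0.85):
    print(f"{share:.0%} of tokens on Flash -> {blended_savings(share):.0%} saved")
```

Because the output rates in the table have the same Flash-to-Pro ratio as the input rates, the result holds for any input/output mix: savings of 60-80% correspond to routing roughly 64-85% of tokens to Flash.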
Enterprise Security Features
VPC Service Controls (VPC-SC)
Isolate VertexAI within your VPC to prevent data exfiltration:
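# Create a VPC-SC perimeter (resources take the project number, not the project ID)
gcloud access-context-manager perimeters create ai-perimeter \
    --title="AI Services Perimeter" \
    --resources="projects/PROJECT_NUMBER" \
    --restricted-services="aiplatform.googleapis.com" \
    --access-levels="accessPolicies/POLICY_ID/accessLevels/corp-network" \
    --policy=POLICY_ID
With this perimeter in place, VertexAI API calls from outside it are allowed only when they satisfy the corp-network access level, for example when they originate from your corporate network.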
Customer-Managed Encryption Keys (CMEK)
Encrypt all VertexAI data with your own Cloud KMS keys:
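# Create a key ring (skip if ai-keyring already exists)
gcloud kms keyrings create ai-keyring \
    --location=asia-south1
# Create encryption key
gcloud kms keys create vertexai-key \
    --keyring=ai-keyring \
    --location=asia-south1 \
    --purpose=encryption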
Audit Logging
Enable Data Access audit logs for all VertexAI operations:
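# Export the current IAM policy
gcloud projects get-iam-policy my-ai-project-2026 --format=json > policy.json
# Edit policy.json: add an auditConfigs entry for aiplatform.googleapis.com
# with log types DATA_READ and DATA_WRITE, then apply the updated policy
gcloud projects set-iam-policy my-ai-project-2026 policy.json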
All model invocations, configuration changes, and data access events are logged to Cloud Logging, which can be exported to BigQuery for analysis or to SIEM systems for security monitoring.
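Editing the exported policy by hand is error-prone, so the audit configuration step can be scripted. The sketch below adds a Data Access entry for `aiplatform.googleapis.com` to a policy dictionary, assuming the standard IAM policy JSON layout (`auditConfigs` containing `service` and `auditLogConfigs` with `logType`); `with_data_access_logs` is a hypothetical helper, and the sample policy is a minimal stand-in for a real export.

```python
import copy
import json


def with_data_access_logs(policy: dict,
                          service: str = "aiplatform.googleapis.com") -> dict:
    """Return a copy of the policy with DATA_READ/DATA_WRITE logging enabled."""
    updated = copy.deepcopy(policy)
    audit_configs = updated.setdefault("auditConfigs", [])
    entry = next((c for c in audit_configs if c.get("service") == service), None)
    if entry is None:
        entry = {"service": service, "auditLogConfigs": []}
        audit_configs.append(entry)
    log_configs = entry.setdefault("auditLogConfigs", [])
    existing = {config.get("logType") for config in log_configs}
    for log_type in ("DATA_READ", "DATA_WRITE"):
        if log_type not in existing:
            log_configs.append({"logType": log_type})
    return updated


# Minimal stand-in for the contents of policy.json.
sample_policy = {"bindings": [], "etag": "placeholder"}
print(json.dumps(with_data_access_logs(sample_policy), indent=2))
```

The function is idempotent and leaves the input untouched, so it can be rerun safely before each `set-iam-policy` call.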
Integration Patterns
REST API
For applications in any language:
curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    "https://asia-south1-aiplatform.googleapis.com/v1/projects/my-ai-project-2026/locations/asia-south1/publishers/google/models/gemini-2.5-pro:generateContent" \
    -d '{
      "contents": [{"role": "user", "parts": [{"text": "Hello from India!"}]}]
    }'
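The same request can be assembled from any language with an HTTP client. As a sketch, the Python standard library version below builds the regional endpoint URL and JSON body used in the curl call but does not send anything; `build_generate_request` is a hypothetical helper, and the token is a placeholder that would normally come from `gcloud auth print-access-token` or a credentials library.

```python
import json
import urllib.request


def build_generate_request(project: str, location: str, model: str,
                           prompt: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a regional endpoint."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/publishers/google/models/{model}:generateContent"
    )
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_generate_request(
    "my-ai-project-2026", "asia-south1", "gemini-2.5-pro",
    "Hello from India!", access_token="PLACEHOLDER_TOKEN",
)
print(request.full_url)
# To send it with a real token: urllib.request.urlopen(request)
```

Note that the region appears twice, in the hostname and in the resource path; keeping both derived from one `location` argument avoids a mismatch that would route the call outside the intended region.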
LangChain Integration
from langchain_google_vertexai import ChatVertexAI
llm = ChatVertexAI(
    model="gemini-2.5-pro",
    project="my-ai-project-2026",
    location="asia-south1",
    temperature=0.1,
    max_tokens=2048
)
response = llm.invoke("Summarize DPDP Act compliance requirements")
print(response.content)
LlamaIndex Integration
from llama_index.llms.vertex import Vertex
# Set the location explicitly so requests stay in the India region
llm = Vertex(model="gemini-2.5-pro", project="my-ai-project-2026", location="asia-south1")
Official Resources
- VertexAI Documentation — Complete API reference and tutorials
- VertexAI Pricing — Current pricing (convert to ₹)
- Model Garden Catalog — Browse all available models
- VertexAI Security Best Practices — Enterprise security guide
- GCP India Regions — Mumbai and Delhi region details
Next Steps
- Compare VertexAI with AWS Bedrock and Azure AI before committing
- Implement security guardrails on top of VertexAI safety settings
- Ensure your setup meets Indian compliance requirements
- Set up secure prompting practices for regulated industry use cases
- Build a prompt library optimized for Gemini models