How do I set up Google VertexAI for an Indian enterprise?

Create a GCP project, enable the Vertex AI API, set the region to asia-south1 (Mumbai) for data residency, create a service account with appropriate IAM roles, install the Python SDK, and start making API calls to Gemini models.

What is VertexAI pricing in Indian Rupees for Gemini models?

As of March 2026, Gemini 2.5 Pro costs approximately ₹105/1M input tokens and ₹420/1M output tokens. Gemini 2.5 Flash costs ₹6.3/1M input tokens and ₹25.2/1M output tokens. Flash is the most cost-effective frontier model available in India.

Is Google VertexAI available in India with data residency?

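Yes. VertexAI is available in asia-south1 (Mumbai) and asia-south2 (Delhi). To keep AI processing and data storage within Indian regions, set the location explicitly on every resource and request, restrict resource locations with an organization policy, and use VPC Service Controls to guard against data exfiltration.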

What is VertexAI Model Garden and how is it different from direct Gemini API?

Model Garden provides access to 200+ models — including Gemini, Llama, Mistral, Stable Diffusion, and specialized models — all through VertexAI's unified API. Direct Gemini API only gives you Gemini models. Model Garden lets you test and deploy multiple models without separate accounts.

How do I use VertexAI grounding with Google Search or private data?

VertexAI grounding connects Gemini to real-time Google Search results or your private data in Cloud Storage or BigQuery. Enable grounding in the generate_content call to reduce hallucinations and provide citations with source URLs.

Google VertexAI — Setup & Integration — India Guide 2026

Enterprise VertexAI setup with India region and pricing

Google VertexAI is one of the three major enterprise AI platforms alongside AWS Bedrock and Azure AI. For Indian enterprises, VertexAI offers a compelling combination: Gemini models (the most cost-effective frontier LLMs), Model Garden with 200+ open-source models, strong India region availability, and tight integration with BigQuery for data-heavy AI workloads.

This guide walks through enterprise-grade setup from GCP project creation to production-ready AI integration.

What You'll Learn

GCP project setup and VertexAI API enablement
Authentication: service accounts and workload identity
Available models: Gemini, open-source via Model Garden
Python SDK quickstart with working code examples
Grounding with Google Search and private data
India region (asia-south1) configuration for data residency
Pricing breakdown in ₹
Enterprise security: VPC-SC, CMEK, audit logging
Integration patterns: REST API, LangChain, LlamaIndex

GCP Project Setup

Step 1: Create a GCP Project

# Install gcloud CLI (if not already installed)
curl https://sdk.cloud.google.com | bash

# Authenticate
gcloud auth login

# Create project
gcloud projects create my-ai-project-2026 --name="Enterprise AI"

# Set as active project
gcloud config set project my-ai-project-2026

# Link billing account (required for VertexAI)
gcloud billing accounts list
gcloud billing projects link my-ai-project-2026 --billing-account=BILLING_ACCOUNT_ID
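Project creation fails when the ID breaks Google Cloud's naming rules: 6 to 30 characters, lowercase letters, digits, and hyphens only, starting with a letter and not ending with a hyphen. For teams that generate project IDs from scripts, a pre-flight check along these lines may save a failed call. This is a minimal sketch; the function name `is_valid_project_id` is hypothetical and not part of any Google SDK.

```python
import re

# Assumed rules: 6-30 characters, lowercase letters / digits / hyphens,
# must start with a letter and must not end with a hyphen.
_PROJECT_ID_RE = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")


def is_valid_project_id(project_id: str) -> bool:
    """Return True if project_id satisfies the naming rules above."""
    return bool(_PROJECT_ID_RE.fullmatch(project_id))


# The ID used throughout this guide passes the check.
print(is_valid_project_id("my-ai-project-2026"))
```

The check only covers syntax. Project IDs must also be globally unique, which only the create call itself can confirm.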

Step 2: Enable VertexAI API

# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com

# Enable additional APIs for full functionality
gcloud services enable compute.googleapis.com
gcloud services enable storage.googleapis.com
gcloud services enable bigquery.googleapis.com

Step 3: Set India Region

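# Set the default Compute Engine region to Mumbai
gcloud config set compute/region asia-south1

# Set the default region for gcloud ai commands
gcloud config set ai/region asia-south1

# List models registered in your project in asia-south1
# (Google's published Gemini models are not listed here; check the
# regional availability documentation for those)
gcloud ai models list --region=asia-south1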

India Data Residency: Always use asia-south1 (Mumbai) or asia-south2 (Delhi) for DPDP Act compliance. Some newer models may launch in us-central1 first — check regional availability before production deployment.
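One way to keep a misconfigured environment from silently sending requests to another region is to validate the location before initializing the SDK. The sketch below assumes the two India regions named above are the only acceptable values; the helper `require_india_region` is hypothetical.

```python
# Regions named in this guide as acceptable for India data residency.
INDIA_REGIONS = {"asia-south1": "Mumbai", "asia-south2": "Delhi"}


def require_india_region(location: str) -> str:
    """Return the normalized region, or raise if it is outside India."""
    normalized = location.strip().lower()
    if normalized not in INDIA_REGIONS:
        allowed = ", ".join(sorted(INDIA_REGIONS))
        raise ValueError(
            f"location {location!r} is not an India region; expected one of: {allowed}"
        )
    return normalized


print(require_india_region("asia-south1"))
```

The returned value can be passed straight to `vertexai.init(project=..., location=...)`, so a value such as `us-central1` fails fast at startup instead of at audit time.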

Authentication

Service Accounts (Recommended for Production)

# Create a service account
gcloud iam service-accounts create vertexai-production \
    --display-name="VertexAI Production Service Account"

# Grant minimal required roles
gcloud projects add-iam-policy-binding my-ai-project-2026 \
    --member="serviceAccount:vertexai-production@my-ai-project-2026.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

# Generate key file (for non-GCP environments)
gcloud iam service-accounts keys create key.json \
    --iam-account=vertexai-production@my-ai-project-2026.iam.gserviceaccount.com
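The `--member` value in the binding command follows a fixed pattern: the prefix `serviceAccount:` followed by `ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com`. When bindings are scripted for several accounts, building the string in one place avoids typos. A small sketch, with a hypothetical helper name:

```python
def service_account_member(account_id: str, project_id: str) -> str:
    """Build the IAM member string for a user-managed service account."""
    email = f"{account_id}@{project_id}.iam.gserviceaccount.com"
    return f"serviceAccount:{email}"


print(service_account_member("vertexai-production", "my-ai-project-2026"))
```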

IAM Roles for VertexAI:

| Role | Purpose | Use Case |
|------|---------|----------|
| roles/aiplatform.user | Invoke models, run predictions | Application service accounts |
| roles/aiplatform.admin | Full VertexAI management | CoE administrators |
| roles/aiplatform.viewer | Read-only access | Monitoring and auditing |
| roles/aiplatform.modelUser | Custom model access | Fine-tuned model deployment |

Workload Identity (GKE)

For applications running on Google Kubernetes Engine, use Workload Identity instead of service account keys:

# Enable Workload Identity on GKE cluster
gcloud container clusters update my-cluster \
    --region=asia-south1 \
    --workload-pool=my-ai-project-2026.svc.id.goog

This eliminates the need to manage and rotate service account key files.

Model Access

Gemini Models

| Model | Best For | Context Window | India Region |
|-------|---------|---------------|:------------:|
| Gemini 2.5 Pro | Complex reasoning, code, analysis | 1M tokens | Yes |
| Gemini 2.5 Flash | High-volume, cost-sensitive tasks | 1M tokens | Yes |
| Gemini 2.0 Flash | Fast inference, simple tasks | 1M tokens | Yes |

Model Garden (200+ Models)

VertexAI Model Garden provides access to open-source and third-party models:

Meta Llama 3.1/4: Open-source, strong multilingual
Mistral/Mixtral: Efficient, code-capable
Claude (Anthropic): Available through VertexAI partnership
Stable Diffusion: Image generation
Whisper: Speech-to-text
Specialized models: Medical, legal, financial domain models

Python SDK Quickstart

Installation

pip install google-cloud-aiplatform

Basic Text Generation

import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize with India region
vertexai.init(project="my-ai-project-2026", location="asia-south1")

# Load Gemini model
model = GenerativeModel("gemini-2.5-pro")

# Generate response
response = model.generate_content(
    "Explain the key provisions of India's DPDP Act 2023 for AI systems"
)

print(response.text)

Streaming Response

# Stream for real-time applications (chatbots, live analysis)
response = model.generate_content(
    "Write a compliance checklist for healthcare AI in India",
    stream=True
)

for chunk in response:
    print(chunk.text, end="", flush=True)
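Applications that stream to a user often also need the full text afterwards, for logging or post-processing. The sketch below collects chunk text from any iterable of objects with a `.text` attribute. It assumes that a chunk without a text part raises `ValueError` when `.text` is read, and it uses stand-in objects instead of a live model call so it runs without credentials.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text of streamed chunks, skipping chunks with no text."""
    parts = []
    for chunk in chunks:
        try:
            parts.append(chunk.text)
        except ValueError:
            # Assumption: chunks without a text part raise ValueError on .text
            continue
    return "".join(parts)


# Stand-in chunks; in real use, pass the iterator returned by
# model.generate_content(..., stream=True).
fake_stream = [SimpleNamespace(text="Hello, "), SimpleNamespace(text="India!")]
print(collect_stream(fake_stream))
```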

Multi-Turn Conversation

chat = model.start_chat()

response1 = chat.send_message("What is SOC2 compliance?")
print(response1.text)

response2 = chat.send_message("How does it apply to AI systems specifically?")
print(response2.text)

# Chat maintains context across turns
response3 = chat.send_message("Give me a checklist for my SaaS product")
print(response3.text)

Structured Output (JSON)

import json

from vertexai.generative_models import GenerationConfig

response = model.generate_content(
    "Analyze these 3 Indian AI startups: Krutrim, Sarvam AI, Ola Maps AI. "
    "Return a JSON array with name, focus_area, funding_status, and strength.",
    generation_config=GenerationConfig(
        response_mime_type="application/json"
    )
)

# response_mime_type requests JSON output, so the text can be parsed directly
startups = json.loads(response.text)
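Requesting JSON output does not by itself guarantee that every field asked for in the prompt is present, so it can help to validate the parsed result before using it. A minimal sketch, assuming the four field names from the prompt above; the helper `parse_startups` is hypothetical.

```python
import json

# Field names requested in the prompt.
REQUIRED_FIELDS = ("name", "focus_area", "funding_status", "strength")


def parse_startups(raw_text: str) -> list:
    """Parse model output and check that it is an array of complete objects."""
    data = json.loads(raw_text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"item {index} is not a JSON object")
        missing = [field for field in REQUIRED_FIELDS if field not in item]
        if missing:
            raise ValueError(f"item {index} is missing fields: {missing}")
    return data
```

In real use, pass `response.text` to `parse_startups` and treat a `ValueError` as a signal to retry or fall back.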

Grounding with Google Search

Grounding connects Gemini to real-time information, reducing hallucinations and providing citations:

from vertexai.generative_models import GenerativeModel, Tool
from vertexai.preview.generative_models import grounding

model = GenerativeModel("gemini-2.5-pro")

# Ground with Google Search
# Note: newer Gemini versions may expect the google_search tool rather than
# google_search_retrieval; check the grounding documentation for your model.
response = model.generate_content(
    "What are the latest RBI guidelines on AI usage in Indian banking?",
    tools=[Tool.from_google_search_retrieval(
        grounding.GoogleSearchRetrieval()
    )]
)

print(response.text)
# Citations and source URLs are returned in response.candidates[0].grounding_metadata

Grounding with Private Data

Connect Gemini to your enterprise data stored in Cloud Storage or BigQuery:

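from vertexai.generative_models import GenerativeModel, Tool
from vertexai.preview import rag

# Create an empty RAG corpus
corpus = rag.create_corpus(
    display_name="Company Policies",
    description="Internal compliance and AI usage policies"
)

# Import documents from Cloud Storage
rag.import_files(
    corpus_name=corpus.name,
    paths=["gs://my-bucket/policies/"],
    chunk_size=512,
    chunk_overlap=50
)

# Wrap the corpus as a retrieval tool
# (rag.retrieval_query only returns matching chunks; it does not call a model)
rag_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[rag.RagResource(rag_corpus=corpus.name)]
        )
    )
)

# Query with grounding
model = GenerativeModel("gemini-2.5-pro", tools=[rag_tool])
response = model.generate_content(
    "What is our company policy on sending customer data to external AI APIs?"
)
print(response.text)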
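The `chunk_size=512` and `chunk_overlap=50` settings control how documents are split before indexing: each chunk holds up to 512 tokens, and consecutive chunks share 50 tokens so that a sentence cut at a boundary still appears whole in one of them. The sketch below illustrates the idea on a plain list of tokens. It is an illustration of overlapping windows only, not the service's actual chunking or tokenization, and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens: list, chunk_size: int = 512, chunk_overlap: int = 50) -> list:
    """Split tokens into windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end
    return chunks


tokens = [str(i) for i in range(1000)]
chunks = chunk_tokens(tokens)
print([len(chunk) for chunk in chunks])
```

A larger overlap improves recall at chunk boundaries but increases the number of chunks stored and retrieved, so it is worth tuning against your own documents.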

India Pricing in ₹

Prices as of March 2026, converted at approximately ₹84/USD.

| Model | Input (₹/1M tokens) | Output (₹/1M tokens) | Context Window |
|-------|--------------------:|---------------------:|---------------|
| Gemini 2.5 Pro | ₹105 | ₹420 | 1M tokens |
| Gemini 2.5 Flash | ₹6.3 | ₹25.2 | 1M tokens |
| Gemini 2.0 Flash | ₹5.0 | ₹21.0 | 1M tokens |

Monthly Cost Estimates:

| Use Case | Volume | Model | Monthly Cost |
|----------|--------|-------|------------:|
| Customer support chatbot | 5M tokens/day | Flash | ~₹4,700/month |
| Document analysis | 20M tokens/day | Pro | ~₹1,57,500/month |
| Code review assistant | 10M tokens/day | Pro | ~₹78,750/month |
| Bulk data classification | 50M tokens/day | Flash | ~₹47,250/month |

Cost Tip: Use Gemini Flash for high-volume, straightforward tasks and reserve Pro for complex reasoning. A typical enterprise saves 60-80% by routing appropriately between models.
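Monthly cost depends heavily on how traffic splits between input and output tokens, which the estimates above do not state. The sketch below makes that split an explicit parameter. It uses the per-token prices from the pricing table, and assumes a 30-day month and an even input/output split by default; both function names are hypothetical. Under those assumptions, 20M tokens/day on Pro comes to ₹1,57,500, and routing 80% of a Pro workload to Flash cuts the bill by roughly 75%.

```python
# Prices in ₹ per 1M tokens (input, output), copied from the pricing table.
PRICES_INR = {
    "gemini-2.5-pro": (105.0, 420.0),
    "gemini-2.5-flash": (6.3, 25.2),
    "gemini-2.0-flash": (5.0, 21.0),
}


def monthly_cost_inr(tokens_per_day: float, model: str,
                     output_share: float = 0.5, days: int = 30) -> float:
    """Estimate monthly cost; output_share is the fraction of output tokens."""
    input_price, output_price = PRICES_INR[model]
    total_millions = tokens_per_day * days / 1_000_000
    input_cost = total_millions * (1 - output_share) * input_price
    output_cost = total_millions * output_share * output_price
    return input_cost + output_cost


def routed_cost_inr(tokens_per_day: float, flash_share: float,
                    output_share: float = 0.5, days: int = 30) -> float:
    """Cost when flash_share of traffic goes to Flash and the rest to Pro."""
    flash = monthly_cost_inr(tokens_per_day * flash_share,
                             "gemini-2.5-flash", output_share, days)
    pro = monthly_cost_inr(tokens_per_day * (1 - flash_share),
                           "gemini-2.5-pro", output_share, days)
    return flash + pro


pro_only = monthly_cost_inr(10_000_000, "gemini-2.5-pro")
routed = routed_cost_inr(10_000_000, flash_share=0.8)
print(f"Pro only: ₹{pro_only:,.0f}  Routed: ₹{routed:,.0f}  "
      f"Savings: {1 - routed / pro_only:.0%}")
```

Chat-style workloads are usually input-heavy, so lowering `output_share` to match measured traffic gives a more realistic figure than the even split.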

Enterprise Security Features

VPC Service Controls (VPC-SC)

Place VertexAI inside a service perimeter to prevent data exfiltration:

# Create a VPC-SC perimeter
# (resources are identified by project number, not project ID)
gcloud access-context-manager perimeters create ai-perimeter \
    --title="AI Services Perimeter" \
    --resources="projects/PROJECT_NUMBER" \
    --restricted-services="aiplatform.googleapis.com" \
    --access-levels="accessPolicies/POLICY_ID/accessLevels/corp-network" \
    --policy=POLICY_ID

With a corp-network access level that matches your corporate IP ranges, VertexAI API calls from outside the perimeter are accepted only when they originate from your corporate network.

Customer-Managed Encryption Keys (CMEK)

Encrypt all VertexAI data with your own Cloud KMS keys:

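# Create a key ring in the Mumbai region (keys must belong to a key ring)
gcloud kms keyrings create ai-keyring \
    --location=asia-south1

# Create encryption key
gcloud kms keys create vertexai-key \
    --keyring=ai-keyring \
    --location=asia-south1 \
    --purpose=encryption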

Audit Logging

Enable Data Access audit logs for all VertexAI operations:

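# Export the current IAM policy
gcloud projects get-iam-policy my-ai-project-2026 --format=json > policy.json

# Edit policy.json: add an auditConfigs entry for aiplatform.googleapis.com
# with the DATA_READ and DATA_WRITE log types, then apply the updated policy
gcloud projects set-iam-policy my-ai-project-2026 policy.json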

All model invocations, configuration changes, and data access events are logged to Cloud Logging, which can be exported to BigQuery for analysis or to SIEM systems for security monitoring.
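Data Access audit logs are configured through an `auditConfigs` section in the project's IAM policy. Editing the exported `policy.json` by hand is error-prone, so the sketch below adds the entry programmatically. It assumes the policy was exported as JSON and that all three Data Access log types should be enabled; the function name `enable_data_access_logs` is hypothetical.

```python
import copy

# Log types that make up Data Access audit logging.
DATA_ACCESS_LOG_TYPES = ("ADMIN_READ", "DATA_READ", "DATA_WRITE")


def enable_data_access_logs(policy: dict,
                            service: str = "aiplatform.googleapis.com") -> dict:
    """Return a copy of the IAM policy with Data Access logs enabled for service."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    audit_configs = updated.setdefault("auditConfigs", [])

    # Reuse an existing entry for this service if there is one.
    entry = next((c for c in audit_configs if c.get("service") == service), None)
    if entry is None:
        entry = {"service": service, "auditLogConfigs": []}
        audit_configs.append(entry)

    existing = {c.get("logType") for c in entry.setdefault("auditLogConfigs", [])}
    for log_type in DATA_ACCESS_LOG_TYPES:
        if log_type not in existing:
            entry["auditLogConfigs"].append({"logType": log_type})
    return updated
```

In practice, load `policy.json` with `json.load`, pass the result through this function, write it back with `json.dump`, and apply it with `gcloud projects set-iam-policy`. Keep the `etag` field from the export so that concurrent policy changes are detected.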

Integration Patterns

REST API

For applications in any language:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://asia-south1-aiplatform.googleapis.com/v1/projects/my-ai-project-2026/locations/asia-south1/publishers/google/models/gemini-2.5-pro:generateContent" \
  -d '{
    "contents": [{"role": "user", "parts": [{"text": "Hello from India!"}]}]
  }'
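The same request can be assembled in code. The sketch below builds the regional endpoint URL and JSON body that the curl command sends, using only the standard library. It does not send the request, and the helper name `build_generate_content_request` is hypothetical.

```python
import json


def build_generate_content_request(project: str, location: str,
                                   model: str, prompt: str):
    """Return (url, body_bytes) for a generateContent call on a Google model."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"publishers/google/models/{model}:generateContent"
    )
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return url, json.dumps(body).encode("utf-8")


url, body = build_generate_content_request(
    "my-ai-project-2026", "asia-south1", "gemini-2.5-pro", "Hello from India!"
)
print(url)
```

To send it, pass `url` and `body` to `urllib.request.Request` with an `Authorization: Bearer <token>` header and `Content-Type: application/json`. Note that the region appears twice, in the hostname and in the path, and both must match for the request to stay in asia-south1.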

LangChain Integration

from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(
    model="gemini-2.5-pro",
    project="my-ai-project-2026",
    location="asia-south1",
    temperature=0.1,
    max_tokens=2048
)

response = llm.invoke("Summarize DPDP Act compliance requirements")

LlamaIndex Integration

from llama_index.llms.vertex import Vertex

# Set the location explicitly so requests stay in the Mumbai region
llm = Vertex(
    model="gemini-2.5-pro",
    project="my-ai-project-2026",
    location="asia-south1"
)

Official Resources

VertexAI Documentation — Complete API reference and tutorials
VertexAI Pricing — Current pricing (convert to ₹)
Model Garden Catalog — Browse all available models
VertexAI Security Best Practices — Enterprise security guide
GCP India Regions — Mumbai and Delhi region details

Next Steps

Compare VertexAI with AWS Bedrock and Azure AI before committing
Implement security guardrails on top of VertexAI safety settings
Ensure your setup meets Indian compliance requirements
Set up secure prompting practices for regulated industry use cases
Build a prompt library optimized for Gemini models
