Google Agents Cli Deploy

Name: Google Agents Cli Deploy
Author: google

google/agents-cli

64.2k installs
5.4k repo stars
Updated July 23, 2026
Google-agents-cli-deploy is a skill for deploying agents to Google Cloud using Agent Runtime, Cloud Run, or GKE.

About

Google-agents-cli-deploy is a skill for deploying agents to Google Cloud using Gemini Enterprise Agent Platform. It covers deployment targets including Agent Runtime, Cloud Run, and GKE, plus CI/CD pipeline configuration and secrets management for production agent workloads.

Deploy to Agent Runtime, Cloud Run, or GKE on Google Cloud
CI/CD pipeline setup and secrets management
Production workflows for agents on Gemini Enterprise

Google Agents Cli Deploy by the numbers

64,198 all-time installs (skills.sh)
+8,441 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #7 of 1,453 DevOps & CI/CD skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

google-agents-cli-deploy capabilities & compatibility

Capabilities: deployment · ci cd
Works with: gcp
Use cases: ci cd · devops
Runs: Remote server
Pricing: Free

npx skills add https://github.com/google/agents-cli --skill google-agents-cli-deploy

Installs	64.2k
repo stars	★ 5.4k
Security audit	2 / 3 scanners passed
Last updated	July 23, 2026
Repository	google/agents-cli ↗

How do you deploy an ADK agent to Vertex AI?

Deploy agents to Google Cloud using Agent Runtime, Cloud Run, or GKE with CI/CD pipelines and secrets management

Who is it for?

Production deployment, CI/CD setup, Cloud infrastructure

Skip if: Unscaffolded repos, Gemini Enterprise marketplace publish flows, or local-only agent debugging without cloud deployment intent.

When should I use this skill?

The user wants to deploy an ADK agent to Vertex AI Agent Runtime or configure source-based Agent Runtime deployment.

What you get

Running Vertex AI Agent Runtime instance, deployed tarball package, and registered AdkApp operations on managed infrastructure.

Deployed Vertex AI Agent Runtime instance
Packaged agent tarball on managed infrastructure

Files

SKILL.md (Markdown)

ADK Deployment Guide

Requires: agents-cli (uv tool install google-agents-cli) — install uv first if needed.

Prefer using the agents-cli commands throughout this guide — they wrap Terraform, Docker, and deployment into a tested pipeline. If your project isn't scaffolded yet, see /google-agents-cli-scaffold to add deployment support first.

Reference Files

For deeper details, consult these reference files in references/:

`cloud-run.md` — Scaling defaults, Dockerfile, session types, networking
`agent-runtime.md` — deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
`gke.md` — GKE Autopilot cluster, Kubernetes manifests, Workload Identity, session types, networking
`terraform-patterns.md` — Custom infrastructure, IAM, state management, importing resources
`batch-inference.md` — BigQuery Remote Function trigger; for Pub/Sub / Eventarc see /google-agents-cli-adk-code
`cicd-pipeline.md` — Full CI/CD pipeline setup, infra cicd flags, runner comparison, WIF auth, pipeline stages
`testing-deployed-agents.md` — Testing instructions per deployment target, curl examples, load tests

Observability: See the /google-agents-cli-observability skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.

---

Deployment Target Decision Matrix

Choose the right deployment target based on your requirements:

Criteria	Agent Runtime	Cloud Run	GKE
Languages	Python	Python	Python (+ others via custom containers)
Scaling	Managed auto-scaling (configurable min/max, concurrency)	Fully configurable (min/max instances, concurrency, CPU allocation)	Full Kubernetes scaling (HPA, VPA, node auto-provisioning)
Networking	VPC-SC and PSC-I supported (private VPC connectivity via network attachments)	Full VPC support, direct VPC egress, IAP, ingress rules	Full Kubernetes networking
Session state	Native `VertexAiSessionService` (persistent, managed)	In-memory (dev), Cloud SQL, or Agent Platform Sessions backend	In-memory (dev), Cloud SQL, or Agent Platform Sessions backend
Batch/event processing	Not supported	Native trigger endpoints (Pub/Sub, Eventarc); see `/google-agents-cli-adk-code`	Custom (Kubernetes Jobs, Pub/Sub)
Cost model	vCPU-hours + memory-hours (not billed when idle)	Per-instance-second + min instance costs	Node pool costs (always-on or auto-provisioned)
Setup complexity	Lower (managed, purpose-built for agents)	Medium (Dockerfile, Terraform, networking)	Higher (Kubernetes expertise required)
Best for	Managed infrastructure, minimal ops	Custom infra, event-driven workloads	Full Kubernetes control

Ask the user which deployment target fits their needs. Each is a valid production choice with different trade-offs.

Product name mapping: "Agent Engine" / "Vertex AI Agent Engine" is now Agent Runtime. Use --deployment-target agent_runtime.

Ambient / scheduled / event-driven agents: Agent Runtime does not support Pub/Sub, Eventarc, or Cloud Scheduler triggers. Use Cloud Run (recommended) or GKE for these workloads. See /google-agents-cli-adk-code (references/adk-python.md, section "12. Event-Driven / Ambient Agents") for the trigger_sources pattern.

OAuth / user consent agents: Use Agent Runtime with Gemini Enterprise for agents that need OAuth 2.0 user consent (e.g., accessing Google Drive, Calendar, or other user-scoped APIs). Cloud Run does not currently support managed OAuth flows. See the adk-ae-oauth sample in /google-agents-cli-workflow Phase 1.
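
The two hard constraints above (no triggers on Agent Runtime, managed OAuth consent only via Agent Runtime) can be sketched as a small filter. This is an illustration only: `candidate_targets` and its parameters are hypothetical names, the `cloud_run` and `gke` identifiers are assumed by analogy with `agent_runtime`, and the remaining trade-offs in the matrix still need a conversation with the user.

```python
def candidate_targets(needs_event_triggers: bool = False,
                      needs_managed_oauth: bool = False) -> list:
    """Return the targets not ruled out by the constraints stated in this guide."""
    # "cloud_run" and "gke" are assumed identifiers; only "agent_runtime" is confirmed.
    targets = ["agent_runtime", "cloud_run", "gke"]
    if needs_event_triggers:
        # Agent Runtime has no Pub/Sub, Eventarc, or Cloud Scheduler triggers.
        targets.remove("agent_runtime")
    if needs_managed_oauth:
        # The guide names Agent Runtime (with Gemini Enterprise) for user consent flows.
        targets = [t for t in targets if t == "agent_runtime"]
    return targets
```

An empty result means the two requirements conflict and the design needs to be split, for example into a triggered service plus a separate consent-handling agent.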

---

Deploying to Dev

Deploy Workflow

Task tracking: Deployment involves multiple sequential steps (infra setup, CI/CD configuration, deploy, verification). Use a task list to track progress through these steps — skipping one often causes failures in later steps that are hard to trace back.

1. If prototype (no deployment target), first enhance: agents-cli scaffold enhance . --deployment-target <target>
2. Notify the human: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
3. Wait for explicit approval
4. Once approved: agents-cli deploy

Agent Runtime timeout recovery: Agent Runtime deploys can take 5-10 minutes and may exceed command timeouts. If the deploy command is cancelled or times out, the deployment continues server-side. Run agents-cli deploy --status to check progress — poll every 60 seconds until it reports completion or failure.
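
The polling loop described above might look like the sketch below. The `check` callable is hypothetical: it would wrap `agents-cli deploy --status` and map its output to `"pending"` or a final state. This guide does not specify the output format of `--status`, so that mapping is left to the caller rather than guessed here.

```python
import time
from typing import Callable


def poll_until_done(check: Callable[[], str],
                    interval_s: float = 60.0,
                    max_attempts: int = 30,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Call check() until it returns anything other than "pending".

    Returns the final state, or "timeout" if max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        state = check()
        if state != "pending":
            return state
        # Do not sleep after the last attempt; there is nothing left to wait for.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return "timeout"
```

Injecting `sleep` keeps the loop testable without waiting a real minute between checks.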

IMPORTANT: Never run agents-cli deploy without explicit human approval.

Do NOT run `agents-cli infra single-project` before deploying. It is not a prerequisite — agents-cli deploy works on its own. Run it separately if the user needs observability features (prompt-response logging, BigQuery analytics) — see /google-agents-cli-observability.

Single-Project Infrastructure Setup (Optional — Advanced)

agents-cli infra single-project runs terraform apply in deployment/terraform/single-project/. Use this to provision single-project GCP infrastructure without CI/CD (service accounts, IAM bindings, telemetry resources, Artifact Registry). Also useful to test things in a single project before going to production. It is NOT required for deploying.

# Optional — provision infrastructure in a single GCP project
agents-cli infra single-project

Note: agents-cli deploy doesn't automatically use the Terraform-created app_sa. Pass the service account via agents-cli deploy --service-account SA_EMAIL or uv run -m app.app_utils.deploy --service-account SA_EMAIL for Agent Runtime targets.

Deploy Flag Reference

Flag	Description	Targets
`--project`	GCP project ID	All
`--region`	GCP region	All
`--service-account`	Service account email for the deployed agent	All
`--service-name`	Override the deployed service name (Cloud Run service or Agent Runtime display name); defaults to the project name. If you override it, consider updating your Terraform and CI (if present) — they name resources from the project name. Not supported for GKE, whose names are fully owned by Terraform.	Agent Runtime, Cloud Run
`--secrets`	Comma-separated `ENV=SECRET` or `ENV=SECRET:VERSION` pairs	Agent Runtime, Cloud Run
`--update-env-vars`	Comma-separated `KEY=VALUE` environment variables	Agent Runtime, Cloud Run
`--agent-identity`	Enable agent identity (Preview)	Agent Runtime
`--network-attachment`	Network attachment resource name for PSC interface (enables private VPC connectivity)	Agent Runtime
`--dns-peering-domain`	DNS peering domain suffix, e.g. `my-internal.corp.` (requires `--network-attachment`)	Agent Runtime
`--dns-peering-project`	Project ID hosting the Cloud DNS managed zone for DNS peering (requires `--network-attachment`)	Agent Runtime
`--dns-peering-network`	VPC network name in the target project for DNS peering (requires `--network-attachment`)	Agent Runtime
`--memory`	Memory limit (default: `4Gi`)	Agent Runtime, Cloud Run
`--cpu`	CPU limit (default: `1`)	Agent Runtime, Cloud Run
`--min-instances`	Minimum number of instances (default: `1`)	Agent Runtime, Cloud Run
`--max-instances`	Maximum number of instances (default: `10`)	Agent Runtime, Cloud Run
`--concurrency`	Concurrent requests per container (default: `8`; see Sizing a deployment)	Agent Runtime, Cloud Run
`--num-workers`	Worker processes per container (default: `1`)	Agent Runtime
`--port`	Container port	Cloud Run
`--iap`	Enable Identity-Aware Proxy	Cloud Run
`--image`	Container image URI (skips source build)	Cloud Run, GKE
`--no-wait`	Start deployment and return immediately	Agent Runtime, Cloud Run
`--status`	Check the status of a pending `--no-wait` deployment	Agent Runtime, Cloud Run
`--list`	List existing deployments and exit	All
`--dry-run` / `-n`	Print what would be executed without running it	All
`--no-confirm-project`	Skip project confirmation prompt	All

Run agents-cli deploy --help for the full flag reference.

Advanced Cloud Run Deploys: If you need features not exposed via agents-cli flags, use --dry-run (or -n) to print the full gcloud command, copy it, and add additional arguments as needed.

Project Confirmation: If the project is resolved automatically (not passed via --project), the command will prompt for confirmation in interactive mode. Since agents typically run in non-interactive mode, you MUST pass --no-confirm-project to proceed if you are relying on automatic project resolution.
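
As a rough sketch of the non-interactive rule above, a wrapper script could assemble the command like this. `build_deploy_command` is a hypothetical helper, not part of agents-cli; it only uses flags listed in the table above.

```python
import shlex
from typing import List, Optional


def build_deploy_command(project: Optional[str] = None,
                         region: Optional[str] = None,
                         dry_run: bool = False) -> List[str]:
    """Assemble an `agents-cli deploy` argument list for non-interactive use."""
    cmd = ["agents-cli", "deploy"]
    if project:
        cmd += ["--project", project]
    else:
        # The project will be resolved automatically, so skip the interactive prompt.
        cmd.append("--no-confirm-project")
    if region:
        cmd += ["--region", region]
    if dry_run:
        # Print what would be executed without running it.
        cmd.append("--dry-run")
    return cmd


print(shlex.join(build_deploy_command(region="us-east1", dry_run=True)))
```

Building a list rather than a string keeps the result safe to pass to `subprocess.run` without shell quoting concerns.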

---

Sizing a deployment

Defaults (same on Agent Runtime, Cloud Run, and the generated service.tf): --cpu 1, --memory 4Gi, --num-workers 1, --concurrency 8, --min-instances 1, --max-instances 10.

The params are coupled — scale them together:

Workers = vCPUs. Each worker is one GIL-bound process that saturates one core, so raise --num-workers with --cpu (e.g. --cpu 4 → --num-workers 4) or you pay for idle cores.
Memory bounds concurrency. Each concurrent request keeps its full working set (context window, history, RAG chunks, response buffer) in memory while it waits on the model, so peak ≈ base + concurrency × per-request memory. Memory — not CPU — is the first limit, so raising --concurrency without --memory is the main OOM cause.
Concurrency default is conservative. An async worker can serve many concurrent requests while it waits on the model, but per-request memory is agent-specific, so 8 protects a memory-heavy (RAG/multimodal) agent. Light agents can raise it to 16–32+ after load-testing. See Underutilized asynchronous workers.

# 4x throughput: scale every param, not just one
agents-cli deploy --cpu 4 --num-workers 4 --concurrency 16 --memory 16Gi

Tune with the scaffolded load test (tests/load_test/, run locally or in the CI/CD staging pipeline): drive load, watch max latency and memory/OOM restarts, then adjust — high max latency → raise concurrency (+ workers/cpu); OOM → raise memory or lower concurrency.

--num-workers is Agent-Runtime-only (Cloud Run runs one uvicorn process). On GKE these flags are rejected — size via the Terraform manifests + HorizontalPodAutoscaler under deployment/terraform/.
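
The memory rule above (peak ≈ base + concurrency × per-request memory) can be checked with a few lines of arithmetic before changing flags. This is a back-of-the-envelope sketch: the function names, the 0.8 headroom factor, and the sample figures are assumptions for illustration, and real per-request memory should come from load-test measurements.

```python
def estimate_peak_memory_gib(base_gib: float, concurrency: int,
                             per_request_gib: float) -> float:
    """peak ~= base + concurrency * per-request memory, all in GiB."""
    return base_gib + concurrency * per_request_gib


def fits_memory_limit(limit_gib: float, base_gib: float, concurrency: int,
                      per_request_gib: float, headroom: float = 0.8) -> bool:
    """True if the estimated peak stays under a fraction (headroom) of the limit."""
    peak = estimate_peak_memory_gib(base_gib, concurrency, per_request_gib)
    return peak <= limit_gib * headroom


# Illustrative numbers only: 1 GiB base, 0.25 GiB per in-flight request.
print(fits_memory_limit(4, 1.0, 8, 0.25))    # default --memory 4Gi, --concurrency 8
print(fits_memory_limit(4, 1.0, 16, 0.25))   # concurrency doubled, memory unchanged
print(fits_memory_limit(16, 1.0, 16, 0.25))  # concurrency and memory raised together
```

With these sample figures, doubling concurrency alone pushes the estimate past the 4Gi limit, which is the OOM pattern the guide warns about.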

---

Production Deployment — CI/CD Pipeline

For the full CI/CD pipeline setup guide — prerequisites, infra cicd flags, runner comparison, WIF authentication, pipeline stages, and production approval — see references/cicd-pipeline.md.

---

Cloud Run Specifics

For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see references/cloud-run.md. For ADK docs on Cloud Run deployment, fetch https://adk.dev/deploy/cloud-run/index.md.

For event-driven / ambient agent deployment on Cloud Run, see the `ambient-expense-agent` sample and /google-agents-cli-adk-code (references/adk-python.md, section "12. Event-Driven / Ambient Agents") for the trigger_sources pattern.

---

Agent Runtime Specifics

Agent Runtime is a managed Vertex AI service for deploying Python ADK agents. It uses source-based deployment (no Dockerfile) via deploy.py and the AdkApp class.

No `gcloud` CLI exists for Agent Runtime. Deploy via agents-cli deploy or deploy.py. Query via the Python vertexai.Client SDK.

Deployments can take 5-10 minutes. Use --no-wait to start a deployment and return immediately, then check on it later with --status:

# Start deployment without blocking
agents-cli deploy --no-wait

# Check on progress later
agents-cli deploy --status

When --status detects the operation has completed, it writes deployment_metadata.json and prints the same success output as a normal deploy.

For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see references/agent-runtime.md. For ADK docs on Agent Runtime deployment, fetch https://adk.dev/deploy/agent-runtime/index.md.

---

GKE Specifics

For detailed infrastructure configuration (Kubernetes manifests, Terraform resources, Workload Identity, session types, networking), see references/gke.md. For ADK docs on GKE deployment, fetch https://adk.dev/deploy/gke/index.md.

---

Service Account Architecture

Scaffolded projects use two service accounts:

`app_sa` (per environment) — Runtime identity for the deployed agent. Roles defined in deployment/terraform/iam.tf.
`cicd_runner_sa` (CI/CD project) — CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in both staging and prod projects.

Check deployment/terraform/iam.tf for exact role bindings. Cross-project permissions (Cloud Run service agents, artifact registry access) are also configured there.

Common 403 errors:

"Permission denied on Cloud Run" → cicd_runner_sa missing deployment role in the target project
"Cannot act as service account" → Missing iam.serviceAccountUser binding on app_sa
"Secret access denied" → app_sa missing secretmanager.secretAccessor
"Cloud SQL connection failed / Not authorized" → Runtime service account missing roles/cloudsql.client
"Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project

---

Required Permissions for CI/CD Setup

`roles/secretmanager.admin` granted to the Cloud Build service account (service-<PROJECT_NUMBER>@gcp-sa-cloudbuild.iam.gserviceaccount.com) in the CI/CD project. This allows Cloud Build to access the GitHub token stored in Secret Manager.

---

Required APIs

The following Google Cloud APIs must be enabled in your project for the skills and deployment to work:

`cloudbuild.googleapis.com` — Required for building container images and running CI/CD pipelines.
`secretmanager.googleapis.com` — Required for managing secrets and API keys.
`run.googleapis.com` — Required for deploying to Cloud Run.

Ensure these are enabled before running deployment or CI/CD setup commands:

gcloud services enable cloudbuild.googleapis.com secretmanager.googleapis.com run.googleapis.com --project=YOUR_PROJECT_ID

---

Secret Manager (for API Credentials)

Instead of passing sensitive keys as environment variables, use GCP Secret Manager.

# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-

# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-

Grant access: For Cloud Run, grant secretmanager.secretAccessor to app_sa. For Agent Runtime, grant it to the platform-managed SA (service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com). For GKE, grant secretmanager.secretAccessor to app_sa. Access secrets via Kubernetes Secrets or directly via the Secret Manager API with Workload Identity.

Pass secrets at deploy time (Agent Runtime, Cloud Run):

agents-cli deploy --secrets "API_KEY=my-api-key,DB_PASS=db-password:2"

Format: ENV_VAR=SECRET_ID or ENV_VAR=SECRET_ID:VERSION (defaults to latest). Access in code via os.environ.get("API_KEY").
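
To make the `--secrets` format concrete, here is a small parser for the same string shape. It is a sketch for validating input before calling the CLI, not the parser agents-cli itself uses; `parse_secrets` is a hypothetical name.

```python
from typing import Dict, Tuple


def parse_secrets(spec: str) -> Dict[str, Tuple[str, str]]:
    """Map ENV_VAR -> (secret_id, version); version defaults to "latest"."""
    parsed = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue
        env_var, sep, secret = pair.partition("=")
        if not sep or not env_var or not secret:
            raise ValueError(f"expected ENV_VAR=SECRET_ID[:VERSION], got {pair!r}")
        # An optional ":VERSION" suffix pins a specific secret version.
        secret_id, _, version = secret.partition(":")
        parsed[env_var] = (secret_id, version or "latest")
    return parsed


print(parse_secrets("API_KEY=my-api-key,DB_PASS=db-password:2"))
```

Running a check like this in a wrapper script catches a malformed pair before a multi-minute deploy starts.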

---

Cloud SQL Permissions (Manual Deployment)

When using Cloud SQL with Cloud Run in a manual deployment (e.g., adding --add-cloudsql-instances in non-Terraform setups), you must manually grant the Cloud SQL Client role to the runtime service account.

Without this, the deployment may succeed but fail at runtime with cloudsql.instances.get authorization errors.

gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
  --member="serviceAccount:YOUR_RUNTIME_SA_EMAIL" \
  --role="roles/cloudsql.client"

Note: In full Terraform-managed setups (infra cicd / infra single-project), this role is configured and managed automatically.

---

Observability

See the /google-agents-cli-observability skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).

---

Testing Your Deployed Agent

The quickest way to test a deployed agent is agents-cli run --url <service-url> --mode <a2a|adk> "your prompt" — it handles auth, sessions, and streaming automatically (supports Agent Runtime and Cloud Run).

For advanced testing (custom headers, session reuse, scripting, load tests), see references/testing-deployed-agents.md.

---

Deploying with a UI (IAP)

IAP (Identity-Aware Proxy) secures a Cloud Run service so only authorized Google accounts can access it. Enable it by adding the --iap flag when deploying (Cloud Run only): agents-cli deploy --iap.

For Agent Runtime with a custom frontend, use a decoupled deployment — deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Runtime backend API.

For more information on IAP with Cloud Run, see the Cloud Console IAP settings.

---

Rollback & Recovery

The primary rollback mechanism is git-based: fix the issue, commit, and push to main. The CI/CD pipeline will automatically build and deploy the new version through staging → production.

For immediate Cloud Run rollback without a new commit, use revision traffic shifting:

gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
  --to-revisions=REVISION_NAME=100 --region=REGION

Agent Runtime doesn't support revision-based rollback — fix and redeploy via agents-cli deploy.

For GKE rollback, use kubectl rollout undo:

kubectl rollout undo deployment/DEPLOYMENT_NAME -n NAMESPACE
kubectl rollout status deployment/DEPLOYMENT_NAME -n NAMESPACE

---

Custom Infrastructure (Terraform)

CRITICAL: When your agent requires custom infrastructure (Cloud SQL, Pub/Sub, Eventarc, BigQuery, etc.), you MUST define it in Terraform — never create resources manually via gcloud commands. Exception: quick experimentation is fine with gcloud or console, but production infrastructure must be in Terraform.

For custom infrastructure patterns, consult references/terraform-patterns.md for:

Where to put custom Terraform files (single-project vs CI/CD)
Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
IAM bindings for custom resources
Terraform state management (remote vs local, importing resources)
Common infrastructure patterns

---

Troubleshooting

Issue	Solution
Terraform state locked	`terraform force-unlock -force LOCK_ID` in deployment/terraform/
GitHub Actions auth failed	Re-run `terraform apply` in CI/CD terraform dir; verify WIF pool/provider
Cloud Build authorization pending	Use `github_actions` runner instead
Resource already exists	`terraform import` (see `references/terraform-patterns.md`)
Agent Runtime deploy timeout / hangs	Deployments take 5-10 min; check if engine was created (see Agent Runtime Specifics)
Secret not available	Verify `secretAccessor` granted to `app_sa` (not the default compute SA)
Cloud SQL connection failed / 403	Grant `roles/cloudsql.client` to the runtime service account when using manual deployments
403 on deploy	Check `deployment/terraform/iam.tf` — `cicd_runner_sa` needs deployment + SA impersonation roles in the target project
403 when testing Cloud Run	Default is `--no-allow-unauthenticated`; include `Authorization: Bearer $(gcloud auth print-identity-token)` header
Cold starts too slow	Set `min_instance_count > 0` in Cloud Run Terraform config
Cloud Run 503 errors	Check resource limits (memory/CPU), increase `max_instance_count`, or check container crash logs
403 right after granting IAM role	IAM propagation is not instant — wait a couple of minutes before retrying. Don't keep re-granting the same role
Resource seems missing but Terraform created it	Run `terraform state list` to check what Terraform actually manages. Resources created via `null_resource` + `local-exec` (e.g., BQ linked datasets) won't appear in `gcloud` CLI output
Deployment failed or agent not responding	Check Cloud Logging: `gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=SERVICE" --project=PROJECT --limit=50 --format="table(timestamp,severity,textPayload)"` for Cloud Run, or `gcloud logging read "resource.type=aiplatform.googleapis.com/ReasoningEngine" --project=PROJECT --limit=50` for Agent Runtime
Agent returns errors after deploy	Open Cloud Logging in Console → filter by service name (Cloud Run) or reasoning engine resource (Agent Runtime) → look for Python tracebacks or permission errors in recent log entries

---

Platform Registration

For registering deployed agents with Gemini Enterprise, see /google-agents-cli-publish.

---

Related Skills

/google-agents-cli-workflow — Development workflow, coding guidelines, and operational rules
/google-agents-cli-adk-code — ADK Python API quick reference for writing agent code
/google-agents-cli-eval — Evaluation methodology, dataset schema, and the eval-fix loop
/google-agents-cli-scaffold — Project creation and enhancement with agents-cli scaffold create / scaffold enhance
/google-agents-cli-observability — Cloud Trace, logging, BigQuery Analytics, and third-party integrations
/google-agents-cli-publish — Gemini Enterprise registration

Agent Runtime Infrastructure

Assumes `/google-agents-cli-scaffold` scaffolding. If your project isn't scaffolded yet, see /google-agents-cli-scaffold first.

Deployment Architecture

Agent Runtime uses source-based deployment — no Docker container or Dockerfile. Your agent code is packaged as a base64-encoded tarball and deployed directly to the managed Vertex AI service.

App class: Your agent extends AdkApp (from vertexai.agent_engines.templates.adk). Check agent_runtime_app.py for the exact implementation. Key methods:

set_up() — Initialization (Vertex AI client, telemetry)
register_operations() — Declare operations exposed to Agent Runtime
register_feedback() — Collect and log user feedback
async_stream_query() — Streaming response method

deploy.py CLI

Scaffolded projects deploy via uv run -m app.app_utils.deploy. Run uv run -m app.app_utils.deploy --help for the full flag reference.

Deployment flow:

1. uv export generates .requirements.txt from the lockfile
2. deploy.py packages the source and creates or updates the Agent Runtime instance
3. deploy.py writes deployment_metadata.json with the engine resource ID

Terraform Resource

Agent Runtime uses google_vertex_ai_reasoning_engine in deployment/terraform/service.tf. Check that file for current scaling, concurrency, and resource limit settings.

Key difference from Cloud Run: the lifecycle.ignore_changes on source_code_spec is critical — source code is updated by CI/CD, not Terraform.

deployment_metadata.json

Written by deploy.py after successful deployment:

{
  "remote_agent_runtime_id": "projects/PROJECT/locations/LOCATION/reasoningEngines/ENGINE_ID",
  "deployment_target": "agent_runtime",
  "is_a2a": false,
  "deployment_timestamp": "2025-02-25T10:30:00.000Z"
}

Used by: subsequent deploys (update vs create), testing notebook, agents-cli run --url. Cloud Run does not use this file.

If deployment times out but the engine was created, manually populate this file with the engine resource ID.
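
If the file has to be recreated by hand, a short script avoids JSON typos. The sketch below mirrors the four fields shown above; `write_deployment_metadata` is a hypothetical helper, and the default path assumes the file lives in the directory you run it from, so check where your project expects it.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_deployment_metadata(engine_resource_name: str,
                              path: str = "deployment_metadata.json",
                              is_a2a: bool = False) -> dict:
    """Write a deployment_metadata.json with the fields shown in this guide."""
    # Millisecond UTC timestamp with a trailing "Z", matching the sample above.
    timestamp = (datetime.now(timezone.utc)
                 .isoformat(timespec="milliseconds")
                 .replace("+00:00", "Z"))
    metadata = {
        "remote_agent_runtime_id": engine_resource_name,
        "deployment_target": "agent_runtime",
        "is_a2a": is_a2a,
        "deployment_timestamp": timestamp,
    }
    Path(path).write_text(json.dumps(metadata, indent=2) + "\n")
    return metadata
```

Pass the full resource name in the `projects/PROJECT/locations/LOCATION/reasoningEngines/ENGINE_ID` form.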

CI/CD Differences from Cloud Run

Aspect	Agent Runtime	Cloud Run
Build	`uv export` → requirements file	Docker build → container image
Deploy command	`uv run -m app.app_utils.deploy`	`gcloud run deploy --image ...`
Artifact	Base64 source tarball	Container image in Artifact Registry
Python version	Fixed at 3.12 (Terraform)	Configurable in Dockerfile
Load testing	Via `locust` against Agent Runtime endpoint	Direct HTTP to Cloud Run URL

Playground & Remote Testing

# Local mode (uses local agent instance)
agents-cli playground

# Query your deployed Agent Runtime remotely (ADK agent)
agents-cli run --url https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/reasoningEngines/ID --mode adk "Hello, what can you do?"

--mode is required with --url: use adk for the ADK streaming API (:streamQuery) or a2a for the A2A protocol. Add -v for full JSON event payloads. Auth is auto-detected via Google Cloud credentials.
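
The `--url` value follows the pattern shown in the command above, so it can be derived from the `remote_agent_runtime_id` stored in deployment_metadata.json. A minimal sketch, with `url_from_resource_name` as a hypothetical helper name:

```python
def url_from_resource_name(resource_name: str) -> str:
    """Build the Agent Runtime URL from a reasoningEngines resource name."""
    parts = resource_name.split("/")
    # Expected shape: projects/P/locations/L/reasoningEngines/ID
    if (len(parts) != 6 or parts[0] != "projects"
            or parts[2] != "locations" or parts[4] != "reasoningEngines"):
        raise ValueError(f"unexpected resource name: {resource_name!r}")
    location = parts[3]
    return f"https://{location}-aiplatform.googleapis.com/v1/{resource_name}"


print(url_from_resource_name(
    "projects/my-project/locations/us-east1/reasoningEngines/1234567890"))
```

The project, location, and engine ID in the example call are placeholders.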

To query Agent Runtime programmatically:

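import asyncio

import vertexai

client = vertexai.Client(location="us-east1")
agent = client.agent_engines.get(name="projects/PROJECT/locations/LOCATION/reasoningEngines/ENGINE_ID")


async def main():
    # async_stream_query is an async generator, so consume it inside a coroutine
    async for event in agent.async_stream_query(message="Hello!", user_id="test"):
        print(event)


asyncio.run(main())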

Session & Artifact Services

Service	Configuration	Notes
Sessions	`InMemorySessionService` (default)	Stateless; state per connection
Sessions	`VertexAiSessionService`	Native managed sessions (persistent)
Artifacts	`GcsArtifactService`	Uses `LOGS_BUCKET_NAME` env var
Artifacts	`InMemoryArtifactService`	Fallback when no bucket configured

Environment variables set during deployment are configured in deploy.py and deployment/terraform/service.tf. Check those files for current values.

Memory Bank

To enable cross-session memory on Agent Runtime, configure memory_bank_config via context_spec. See the `memory-bank` sample for the full pattern.

Networking (PSC Interface)

Agent Runtime cannot reach your VPC by default. To enable private connectivity, create a network attachment and deploy with --network-attachment. Add --dns-peering-domain, --dns-peering-project, and --dns-peering-network if you need private DNS resolution. PSC config is immutable after deployment — delete and redeploy to change it. See the GCP docs for prerequisites and setup.

Batch Inference (Cloud Run)

Invoke an ADK agent as a BigQuery Remote Function for batch inference over table rows. This requires a custom POST / endpoint since BQ cannot use URL paths.

For event-driven triggers (Pub/Sub, Eventarc), use ADK's native trigger_sources — see /google-agents-cli-adk-code.

BigQuery Remote Function

BQ sends {"calls": [["row1"], ...], "caller": "..."}, expects {"replies": ["...", ...]} in same order. BQ cannot use URL paths — register at POST /.

import asyncio, json, uuid
from fastapi import Request
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types
from my_agent.agent import root_agent

APP_NAME = "my_agent"
_trigger_session_service = InMemorySessionService()
_trigger_runner = Runner(
    agent=root_agent, app_name=APP_NAME, session_service=_trigger_session_service,
)

async def _run_agent(message_text: str, user_id: str = "trigger") -> list:
    session = await _trigger_session_service.create_session(
        app_name=APP_NAME, user_id=user_id, session_id=str(uuid.uuid4())
    )
    events = []
    async for event in _trigger_runner.run_async(
        user_id=user_id, session_id=session.id,
        new_message=types.Content(role="user", parts=[types.Part(text=message_text)]),
    ):
        events.append(event)
    return events

# `app` is assumed to be your service's existing FastAPI instance
@app.post("/")
async def trigger_bq(request: Request):
    body = await request.json()
    calls: list = body.get("calls", [])
    user_id = body.get("caller") or body.get("sessionUser") or "bq"

    async def _process_row(row_args: list) -> str:
        text = row_args[0] if (len(row_args) == 1 and isinstance(row_args[0], str)) \
               else json.dumps(row_args)
        try:
            events = await _run_agent(text, user_id=user_id)
            return json.dumps([e.model_dump(mode="json") for e in events])
        except Exception as e:
            return f"Error: {e}"

    replies = await asyncio.gather(*[_process_row(row) for row in calls])
    return {"replies": list(replies)}
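
The request/response contract is easier to see without the web framework around it. The sketch below is a synchronous, framework-free illustration of the same shape: one reply per call, in the same order, with a failing row turned into an error string instead of failing the whole batch. `build_replies` and `handle_row` are hypothetical names.

```python
from typing import Callable, List


def build_replies(body: dict, handle_row: Callable[[List], str]) -> dict:
    """Turn {"calls": [[...], ...]} into {"replies": [...]}, preserving order."""
    replies = []
    for row_args in body.get("calls", []):
        try:
            replies.append(handle_row(row_args))
        except Exception as exc:
            # Keep the reply list aligned with the calls even when one row fails.
            replies.append(f"Error: {exc}")
    return {"replies": replies}


def shout(row_args: List) -> str:
    """Toy handler standing in for the agent call."""
    if row_args[0] == "b":
        raise ValueError("bad row")
    return row_args[0].upper()


print(build_replies({"calls": [["a"], ["b"], ["c"]], "caller": "test"}, shout))
```

The important invariant is that the reply list has exactly as many entries as `calls`, in the same positions.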

BQ remote function Terraform:

resource "google_bigquery_routine" "my_fn" {
  routine_type    = "SCALAR_FUNCTION"
  language        = "SQL"
  definition_body = ""
  arguments {
    name          = "message"
    argument_kind = "FIXED_TYPE"
    data_type     = jsonencode({ typeKind = "STRING" })
  }
  return_type = jsonencode({ typeKind = "STRING" })
  remote_function_options {
    endpoint   = google_cloud_run_v2_service.app.uri  # root URL only
    connection = google_bigquery_connection.my_conn.name
  }
}

Production Deployment — CI/CD Pipeline

Best for: Production applications, teams requiring staging → production promotion.

Prerequisites:

1. Project must NOT be in a gitignored folder
2. User must provide staging and production GCP project IDs
3. GitHub repository name and owner

Steps:

1. If prototype, first add Terraform/CI-CD files using the Agents CLI (see /google-agents-cli-scaffold for full options):

   agents-cli scaffold enhance . --cicd-runner github_actions

2. Ensure you're logged in to GitHub CLI:

   gh auth login  # (skip if already authenticated)

3. Run infra cicd:

   agents-cli infra cicd \
     --staging-project YOUR_STAGING_PROJECT \
     --prod-project YOUR_PROD_PROJECT \
     --repository-name YOUR_REPO_NAME \
     --create

4. Push code to trigger deployments

Key `infra cicd` Flags

Flag	Required	Description
`--staging-project`	Yes	GCP project ID for staging environment
`--prod-project`	Yes	GCP project ID for production environment
`--repository-name`	Yes	GitHub repository name
`--create`	No	Create a new GitHub repository. Omit to use an existing one (the command verifies the repository exists either way)
`--repository-owner`	No	GitHub repo owner. Defaults to your `gh` CLI user — set this when creating under (or pointing to) a GitHub organization or another user's account
`--cicd-project`	No	Separate GCP project for CI/CD infrastructure. Defaults to prod project
`--region`	No	GCP region. Auto-detected or defaults to `us-east1`
`--local-state`	No	Store Terraform state locally instead of in GCS (see `references/terraform-patterns.md`)

Run agents-cli infra cicd --help for the full flag reference (Cloud Build options, dev project, region, etc.).

Choosing a CI/CD Runner

Runner	Pros	Cons
github_actions (Default)	No PAT needed, uses `gh auth`, WIF-based, fully automated	Requires GitHub CLI authentication
google_cloud_build	Native GCP integration	Requires `--github-pat` and `--github-app-installation-id` in programmatic mode (or `-i` for interactive OAuth flow)

Cloud Build Example

agents-cli infra cicd \
  --staging-project YOUR_STAGING_PROJECT \
  --prod-project YOUR_PROD_PROJECT \
  --repository-name YOUR_REPO_NAME \
  --create \
  --github-pat YOUR_PAT \
  --github-app-installation-id YOUR_APP_ID

How Authentication Works (WIF)

Both runners use Workload Identity Federation (WIF) — GitHub/Cloud Build OIDC tokens are trusted by a GCP Workload Identity Pool, which grants cicd_runner_sa impersonation. No long-lived service account keys needed. Terraform in infra cicd creates the pool, provider, and SA bindings automatically. If auth fails, re-run terraform apply in the CI/CD Terraform directory.

CI/CD Pipeline Stages

The pipeline has three stages:

1. CI (PR checks): Triggered on pull request. Runs unit and integration tests.

2. Staging CD: Triggered on merge to main. Builds container, deploys to staging, runs load tests.

Path filter: Staging CD only triggers when relevant paths change: the agent directory (app/** by default), data_ingestion/**, tests/**, deployment/**, or uv.lock. The first push after infra cicd won't trigger staging CD unless one of these changes. If nothing happens after pushing, this is why.

3. Production CD: Triggered after successful staging deploy via workflow_run. Might require manual approval before deploying to production.

Approving: Go to GitHub Actions → the production workflow run → click "Review deployments" → approve the pending production environment. This is GitHub's environment protection rules, not a custom mechanism.
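To check locally whether a push would match the staging path filter described above, the rule can be approximated in Python. This is a minimal sketch, assuming the default app/ agent directory. The helper name triggers_staging_cd is hypothetical, and fnmatchcase only approximates GitHub's glob semantics, so treat the generated workflow file as the source of truth.

```python
from fnmatch import fnmatchcase

# Path filters as listed in this section; "app/**" assumes the default agent directory.
STAGING_CD_PATHS = ["app/**", "data_ingestion/**", "tests/**", "deployment/**", "uv.lock"]


def triggers_staging_cd(changed_files, patterns=STAGING_CD_PATHS):
    """Return True if any changed file matches one of the staging path filters."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


# A docs-only push matches no filter; touching the agent directory does.
print(triggers_staging_cd(["README.md"]))
print(triggers_staging_cd(["README.md", "app/agent.py"]))
```

Feeding it the output of git diff --name-only for the pushed commits is one way to see why a push did not start a staging run.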

IMPORTANT: infra cicd creates infrastructure but doesn't deploy automatically. Terraform configures all required GitHub secrets and variables (WIF credentials, project IDs, service accounts). Push code to trigger the pipeline:

git add . && git commit -m "Initial agent implementation"
git push origin main

To approve production deployment:

# GitHub Actions: Approve via repository Actions tab (environment protection rules)

# Cloud Build: Find pending build and approve
gcloud builds list --project=PROD_PROJECT --region=REGION --filter="status=PENDING"
gcloud builds approve BUILD_ID --project=PROD_PROJECT

Non-GitHub Providers (GitLab, Bitbucket, etc.)

The agents-cli infra cicd command only supports GitHub. It requires the gh CLI, uses the Terraform github provider, and both CI/CD runners (GitHub Actions, Cloud Build) assume a GitHub source repo.

For other git providers, use the scaffolded Terraform as a starting point:

1. Run agents-cli scaffold enhance to generate the Terraform and CI/CD files
2. Replace the github provider and resources in deployment/terraform/cicd/ with your provider's equivalents
3. Adapt the CI/CD pipeline files (e.g., replace .github/workflows/ with .gitlab-ci.yml)
4. Run terraform apply directly instead of agents-cli infra cicd

Cloud Run Infrastructure

Assumes `/google-agents-cli-scaffold` scaffolding. If your project isn't scaffolded yet, see /google-agents-cli-scaffold first.

Scaling & Resource Defaults

Agents CLI scaffolds Cloud Run infrastructure in deployment/terraform/single-project/service.tf (and the cicd/ variant). Check that file for current resource limits, scaling configuration, concurrency, and session affinity settings.

Key settings to be aware of: cpu_idle (CPU allocation strategy), min_instance_count (cold start avoidance), max_instance_request_concurrency (concurrency per instance), and session_affinity (sticky routing).

For how to size cpu/memory/workers/concurrency together (and avoid OOM), see Sizing a deployment in the /google-agents-cli-deploy skill.

Dockerfile

Scaffolded projects include a Dockerfile using single-stage build with uv for dependency management. Check the project root Dockerfile for the exact configuration.

FastAPI Endpoints

Available endpoints vary by project template. Check app/fast_api_app.py for the exact routes in your project.

Session Types

Type	Configuration	Use Case
In-memory	Default (`session_service_uri = None`)	Local dev only; lost on instance restart
Cloud SQL	`--session-type cloud_sql` at scaffold time	Production persistent sessions (Postgres 15, IAM auth)
Agent Runtime	`session_service_uri = agentengine://{resource_name}`	When using Agent Runtime as session backend

Cloud SQL session infrastructure (instance, database, Cloud SQL Unix socket volume mount) is configured in deployment/terraform/single-project/service.tf.

Manual Deployment Warning: When using Cloud SQL without Terraform (e.g., direct gcloud run deploy with --add-cloudsql-instances), you MUST manually grant roles/cloudsql.client to the runtime service account, otherwise the connection will fail with authorization errors.
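One way to make that grant is a project-level IAM binding with gcloud projects add-iam-policy-binding. The sketch below only assembles the command and does not run it. The helper name and the example project and service account values are hypothetical placeholders.

```python
import shlex


def cloudsql_client_grant_command(project_id, service_account_email):
    """Build the gcloud argv that grants roles/cloudsql.client to a service account."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{service_account_email}",
        "--role=roles/cloudsql.client",
    ]


# Placeholder values; substitute your own project and runtime service account.
command = cloudsql_client_grant_command(
    "my-project", "my-agent-app@my-project.iam.gserviceaccount.com"
)
print(shlex.join(command))
```

Passing the resulting list to subprocess.run(command, check=True) would execute it with your active gcloud credentials.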

Network & Ingress

Default ingress is INGRESS_TRAFFIC_ALL (public). To restrict, change the ingress setting in service.tf to INGRESS_TRAFFIC_INTERNAL_ONLY (VPC only) or INGRESS_TRAFFIC_INTERNAL_LOAD_BALANCER (internal + GCLB).

IAP (Identity-Aware Proxy) can be enabled by running agents-cli deploy --iap (Cloud Run only), which adds Google identity authentication without code changes. IAP is configured by the deploy flag, not by a generated Terraform variable.

VPC connectors are not configured by default. Add them in custom Terraform if needed for private resource access (see references/terraform-patterns.md).

GKE Infrastructure

Assumes `agents-cli` scaffolding. If your project isn't scaffolded yet, see /google-agents-cli-scaffold first.

Deployment Architecture

GKE uses container-based deployment to a managed GKE Autopilot cluster. Your agent is packaged as a Docker container (same Dockerfile as Cloud Run), pushed to Artifact Registry, and deployed via Terraform-managed Kubernetes resources.

Dockerfile

Scaffolded projects include a Dockerfile using single-stage build with uv for dependency management — same as Cloud Run. Check the project root Dockerfile for the exact configuration.

Kubernetes Resources (Terraform-Managed)

All Kubernetes resources are managed by Terraform in deployment/terraform/cicd/service.tf (staging/prod) and deployment/terraform/single-project/service.tf (single-project). CI/CD pipelines only update the container image via kubectl set image.

Resource	Purpose
`kubernetes_deployment_v1`	Pod spec, container config, resource requests/limits, startup/readiness/liveness probes, env vars, optional Cloud SQL proxy sidecar
`kubernetes_service_v1`	LoadBalancer service exposing port 8080
`kubernetes_horizontal_pod_autoscaler_v2`	HorizontalPodAutoscaler (2-10 replicas, 70% CPU target)
`kubernetes_pod_disruption_budget_v1`	PodDisruptionBudget (minAvailable: 1)
`kubernetes_service_account_v1`	Kubernetes ServiceAccount for Workload Identity
`kubernetes_namespace_v1`	Namespace for the application
`kubernetes_secret_v1`	DB password secret (Cloud SQL only)

Terraform Infrastructure

GKE infrastructure is provisioned in deployment/terraform/single-project/service.tf. Check that file for current configuration.

Key differences from Cloud Run: Terraform provisions a full networking stack (VPC, subnet, Cloud NAT for private node internet access) and a GKE Autopilot cluster with private nodes. Cloud SQL (optional, when session_type == "cloud_sql") uses a proxy sidecar in the pod rather than Cloud Run's Unix socket volume mount.

Workload Identity

GKE uses Workload Identity to map Kubernetes service accounts to GCP service accounts. The Kubernetes SA is annotated with the GCP app_sa email and bound via an iam.workloadIdentityUser IAM binding in Terraform.

This lets pods authenticate as app_sa without service account keys — same security model as Cloud Run's service identity, but configured through Kubernetes.
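The two halves of that mapping follow the standard GKE Workload Identity formats: an iam.gke.io/gcp-service-account annotation on the Kubernetes service account, and an IAM member of the form serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME] bound to roles/iam.workloadIdentityUser. The sketch below just renders both strings. The function names and example values are hypothetical, and the scaffolded Terraform remains the source of truth for the real names.

```python
def workload_identity_member(project_id, namespace, ksa_name):
    """IAM member string that identifies a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def ksa_annotation(gcp_service_account_email):
    """Annotation that points the Kubernetes service account at the GCP service account."""
    return {"iam.gke.io/gcp-service-account": gcp_service_account_email}


# Placeholder names for illustration only.
print(workload_identity_member("my-project", "my-agent", "my-agent-ksa"))
print(ksa_annotation("my-agent-app@my-project.iam.gserviceaccount.com"))
```

If pods fail to authenticate, comparing these two strings against the live annotation and IAM policy is a quick first check.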

Session Types

Type	Configuration	Use Case
In-memory	Default (`session_service_uri = None`)	Local dev only; lost on pod restart
Cloud SQL	`--session-type cloud_sql` at scaffold time	Production persistent sessions (Cloud SQL proxy sidecar in pod)
Agent Runtime	`session_service_uri = agentengine://{resource_name}`	When using Agent Runtime as session backend

Cloud SQL in GKE uses a proxy sidecar container in the pod (unlike Cloud Run which uses a Unix socket volume mount). The sidecar is configured in the kubernetes_deployment_v1 Terraform resource.

FastAPI Endpoints

Available endpoints vary by project template. Check app/fast_api_app.py for the exact routes in your project.

Testing Your Deployed Agent

GKE LoadBalancer services are internal by default — they are not accessible from outside the VPC. Use kubectl port-forward to access the service locally:

# Start port-forward (runs in background)
kubectl port-forward svc/SERVICE_NAME 8080:8080 -n NAMESPACE &

# Test health endpoint
curl "http://127.0.0.1:8080/"

# Create a session
curl -X POST "http://127.0.0.1:8080/apps/app/users/test-user/sessions" \
  -H "Content-Type: application/json" \
  -d '{}'

# Send a message via SSE streaming
curl -X POST "http://127.0.0.1:8080/run_sse" \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "app",
    "user_id": "test-user",
    "session_id": "SESSION_ID",
    "new_message": {"role": "user", "parts": [{"text": "Hello!"}]}
  }'

Network & Ingress

GKE LoadBalancer services are internal by default (the cloud.google.com/load-balancer-type: "Internal" annotation is set in Terraform). The internal IP is used for pod-to-pod A2A communication within the cluster. Use kubectl port-forward for local access.

Custom Infrastructure (Terraform)

Assumes `/google-agents-cli-scaffold` scaffolding. These patterns apply to projects with deployment/terraform/ directories.

Where to Put Custom Terraform

Scenario	Location
Single-project infrastructure	`deployment/terraform/single-project/`
CI/CD environments (staging/prod)	`deployment/terraform/cicd/`

Example: Custom Resources

# deployment/terraform/single-project/custom_resources.tf

resource "google_pubsub_topic" "events" {
  name    = "${var.project_name}-events"
  project = var.project_id
}

resource "google_bigquery_dataset" "analytics" {
  dataset_id = "${replace(var.project_name, "-", "_")}_analytics"
  project    = var.project_id
  location   = var.region
}

# Eventarc trigger for Cloud Storage
resource "google_eventarc_trigger" "storage_trigger" {
  name     = "${var.project_name}-storage-trigger"
  location = var.region
  project  = var.project_id

  matching_criteria {
    attribute = "type"
    value     = "google.cloud.storage.object.v1.finalized"
  }
  matching_criteria {
    attribute = "bucket"
    value     = google_storage_bucket.uploads.name
  }

  destination {
    cloud_run_service {
      service = google_cloud_run_v2_service.app.name
      region  = var.region
      path    = "/apps/${var.project_name}/trigger/eventarc"
    }
  }

  service_account = google_service_account.app_sa.email
}

For CI/CD environments (staging/prod):

Add resources to deployment/terraform/cicd/ (applies to staging and prod):

# deployment/terraform/cicd/custom_resources.tf
# Resources here are created in BOTH staging and prod projects
# Use for_each with local.deploy_project_ids for multi-environment

resource "google_pubsub_topic" "events" {
  for_each = local.deploy_project_ids
  name     = "${var.project_name}-events"
  project  = each.value
}

IAM for Custom Resources

Single-project (deployment/terraform/single-project/):

resource "google_pubsub_topic_iam_member" "app_publisher" {
  topic   = google_pubsub_topic.events.name
  project = var.project_id
  role    = "roles/pubsub.publisher"
  member  = "serviceAccount:${google_service_account.app_sa.email}"
}

# Grant BigQuery data editor
resource "google_bigquery_dataset_iam_member" "app_editor" {
  dataset_id = google_bigquery_dataset.analytics.dataset_id
  project    = var.project_id
  role       = "roles/bigquery.dataEditor"
  member     = "serviceAccount:${google_service_account.app_sa.email}"
}

CI/CD (deployment/terraform/cicd/) — use for_each to apply across environments:

resource "google_pubsub_topic_iam_member" "app_publisher" {
  for_each = local.deploy_project_ids
  topic    = google_pubsub_topic.events[each.key].name
  project  = each.value
  role     = "roles/pubsub.publisher"
  member   = "serviceAccount:${google_service_account.app_sa[each.key].email}"
}

Applying Custom Infrastructure

# For single-project infrastructure
agents-cli infra single-project  # Runs terraform apply in deployment/terraform/single-project/

# For CI/CD, infrastructure is applied automatically on push

Common Patterns

Cloud Storage trigger (Eventarc):

Create bucket in Terraform
Create Eventarc trigger pointing to /apps/{app_name}/trigger/eventarc endpoint
Grant eventarc.eventReceiver role to app service account

Pub/Sub processing:

Create topic and push subscription in Terraform
Point subscription to /apps/{app_name}/trigger/pubsub endpoint
Grant iam.serviceAccountTokenCreator role for push auth

BigQuery Remote Function:

Create BigQuery connection in Terraform
Grant connection service account permission to invoke Cloud Run
Create the remote function via SQL after deployment

Cloud SQL sessions:

Already configured when using --session-type cloud_sql via the Agents CLI (see /google-agents-cli-scaffold)
Additional tables/schemas can be added via migration scripts

Terraform State Management

Remote State (Default)

By default, infra cicd creates a GCS bucket for remote Terraform state:

# Auto-configured backend in deployment/terraform/cicd/backend.tf
terraform {
  backend "gcs" {
    bucket = "{cicd_project}-terraform-state"
    prefix = "{repository_name}/{prod|single-project}"
  }
}

The state bucket is named {cicd_project}-terraform-state and uses the repository name + environment as the prefix to isolate state per project and environment.
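As a quick illustration of that naming scheme, the sketch below derives the bucket and prefix from the CI/CD project ID, repository name, and environment. The function name and example values are hypothetical, and the environment names are taken from the backend snippet above; backend.tf in your project is authoritative.

```python
def remote_state_location(cicd_project, repository_name, environment):
    """Return (bucket, prefix) following the naming scheme described above."""
    bucket = f"{cicd_project}-terraform-state"
    prefix = f"{repository_name}/{environment}"
    return bucket, prefix


# Placeholder values for illustration only.
bucket, prefix = remote_state_location("my-cicd-project", "my-agent-repo", "prod")
print(f"gs://{bucket}/{prefix}")
```

This is handy when locating state with gcloud storage ls before an import or a manual recovery.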

Local State

Use the --local-state flag with infra cicd to skip remote backend setup and store state locally:

agents-cli infra cicd \
  --staging-project STAGING_PROJECT \
  --prod-project PROD_PROJECT \
  --repository-name REPO_NAME \
  --create \
  --local-state

Local state is stored in deployment/terraform/cicd/terraform.tfstate. This is suitable for single-developer projects but not recommended for teams (state conflicts).

Importing Existing Resources

If resources already exist (e.g., created manually or by a previous deployment), import them into Terraform state:

# Import a Cloud Run service
cd deployment/terraform/single-project
terraform import google_cloud_run_v2_service.app \
  projects/PROJECT_ID/locations/REGION/services/SERVICE_NAME

# Import a service account
terraform import google_service_account.app_sa \
  projects/PROJECT_ID/serviceAccounts/SA_EMAIL

# Import a secret
terraform import google_secret_manager_secret.my_secret \
  projects/PROJECT_ID/secrets/SECRET_NAME

After importing, run terraform plan to verify the imported state matches the configuration. Fix any drift before applying.

Testing Your Deployed Agent

Quick Test (Recommended)

The fastest way to test any deployed agent is the run --url command — it handles authentication, session creation, and streaming automatically:

# A2A protocol
agents-cli run --url https://my-agent-abc123.run.app --mode a2a "Hello, what can you do?"

# ADK streaming API
agents-cli run --url https://my-agent-abc123.run.app --mode adk "Hello, what can you do?"

# Agent Runtime (auto-detected from URL — works with either mode)
agents-cli run --url https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT/locations/LOCATION/reasoningEngines/ID --mode adk "Hello!"

# Custom auth header (overrides auto-detected credentials)
agents-cli run --url https://my-agent.run.app --mode a2a -H "Authorization: Bearer my-token" "Hello!"

The --mode flag is required with --url: use adk for the ADK streaming API (/run_sse, or :streamQuery for Agent Runtime) or a2a for the A2A protocol. Agent Runtime URLs are detected automatically. Add -v for full JSON event payloads.

Auth is auto-detected via Google Cloud credentials. Use --header / -H to override.

For more control (scripting, direct curl), see the target-specific sections below.

---

Agent Runtime Deployment

Beyond the run --url quick test above, you can query the deployment directly.

Option 1: Python Script

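import asyncio
import json

import vertexai

# deployment_metadata.json is written by the deploy step
with open("deployment_metadata.json") as f:
    engine_id = json.load(f)["remote_agent_runtime_id"]

client = vertexai.Client(location="us-east1")
agent = client.agent_engines.get(name=engine_id)


async def main() -> None:
    # `async for` is only valid inside a coroutine, so wrap the stream in main()
    async for event in agent.async_stream_query(message="Hello!", user_id="test"):
        print(event)


asyncio.run(main())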

Option 2: Playground

agents-cli playground

Cloud Run Deployment

Auth required by default. Cloud Run deploys with --no-allow-unauthenticated, so all requests need an Authorization: Bearer header with an identity token. Getting a 403? You're likely missing this header. To allow public access, redeploy with --allow-unauthenticated.

SERVICE_URL="https://SERVICE_NAME-PROJECT_NUMBER.REGION.run.app"
AUTH="Authorization: Bearer $(gcloud auth print-identity-token)"

# Test health endpoint
curl -H "$AUTH" "$SERVICE_URL/"

# Step 1: Create a session (required before sending messages)
curl -X POST "$SERVICE_URL/apps/app/users/test-user/sessions" \
  -H "Content-Type: application/json" \
  -H "$AUTH" \
  -d '{}'
# → returns JSON with "id" — use this as SESSION_ID below

# Step 2: Send a message via SSE streaming
curl -X POST "$SERVICE_URL/run_sse" \
  -H "Content-Type: application/json" \
  -H "$AUTH" \
  -d '{
    "app_name": "app",
    "user_id": "test-user",
    "session_id": "SESSION_ID",
    "new_message": {"role": "user", "parts": [{"text": "Hello!"}]}
  }'

Common mistake: Using {"message": "Hello!", "user_id": "...", "session_id": "..."} returns 422 Field required. The ADK HTTP server expects the new_message / parts schema shown above, and the session must already exist.
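For scripting, the same two calls can be assembled in Python with the standard library. This is a minimal sketch that builds the requests without sending them, assuming the app name "app" and the endpoints used in the curl examples above. The helper names are hypothetical, and the token is whatever gcloud auth print-identity-token returns.

```python
import json
import urllib.request


def _headers(token):
    return {"Content-Type": "application/json", "Authorization": f"Bearer {token}"}


def build_session_request(base_url, token, app_name="app", user_id="test-user"):
    """POST that creates a session; the response JSON carries the session "id"."""
    url = f"{base_url}/apps/{app_name}/users/{user_id}/sessions"
    return urllib.request.Request(url, data=b"{}", headers=_headers(token), method="POST")


def build_run_sse_request(base_url, token, session_id, text,
                          app_name="app", user_id="test-user"):
    """POST to /run_sse using the new_message / parts schema."""
    body = {
        "app_name": app_name,
        "user_id": user_id,
        "session_id": session_id,
        "new_message": {"role": "user", "parts": [{"text": text}]},
    }
    return urllib.request.Request(
        f"{base_url}/run_sse",
        data=json.dumps(body).encode("utf-8"),
        headers=_headers(token),
        method="POST",
    )


# Placeholder URL, token, and session ID for illustration only.
request = build_run_sse_request("https://example.run.app", "TOKEN", "SESSION_ID", "Hello!")
print(request.get_method(), request.full_url)
```

To send a request, pass it to urllib.request.urlopen and read the response line by line; the session request must succeed first so that a real session ID is available.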

GKE Deployment

GKE LoadBalancer services are internal by default. See references/gke.md for curl examples and endpoint details.

Load Tests

See tests/load_test/README.md for configuration, default settings, and CI/CD integration details. Load tests run automatically during the staging CD pipeline stage.

How it compares

Use google-agents-cli-deploy for Vertex AI runtime hosting; use google-agents-cli-publish to register the deployed agent with Gemini Enterprise.

FAQ

Does ADK Agent Runtime deployment require Docker?

google-agents-cli-deploy states Agent Runtime uses source-based deployment. Agent code is packaged as a base64-encoded tarball and deployed directly to Vertex AI without a Docker container or Dockerfile.

Which base class do deployable ADK agents extend?

google-agents-cli-deploy specifies agents extend AdkApp from vertexai.agent_engines.templates.adk, implementing set_up for initialization and register_operations to declare operations exposed to Agent Runtime.

What must exist before using google-agents-cli-deploy?

google-agents-cli-deploy assumes the project was created with google-agents-cli-scaffold scaffolding. Unscaffolded repos should run scaffold create first before attempting Vertex AI Agent Runtime deployment.

Is Google Agents Cli Deploy safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
