Google ADK

Google Agent Development Kit (ADK) is an open-source Python framework from Google for building, evaluating, and deploying AI agents and multi-agent systems. ADK provides a composable architecture: agents are defined as Python classes or functions, tools are type-annotated callables, and orchestration uses sequential, parallel, or loop workflows. ADK integrates natively with Vertex AI and Gemini models, making it a natural choice when building agents on GCP. For data engineers, ADK enables autonomous pipeline orchestration, data research workflows, and intelligent monitoring systems deployed to Cloud Run or Agent Engine.

Python 3.9+ · google-adk · Gemini / Vertex AI · Multi-Agent · GCP-native · MCP-compatible

Table of Contents

  1. Core Concepts
  2. Industry Use Cases
  3. Code Examples
  4. Comparison Table
  5. Gotchas & Pitfalls
  6. Exercises
  7. Quiz
  8. Further Reading

Core Concepts

1. Agent Core

An ADK agent is a Python class inheriting from the appropriate base (LlmAgent for LLM-driven agents, BaseAgent for custom logic). Key constructor params: name, model (e.g. "gemini-2.0-flash"), instruction (system prompt), tools (list of callables or Tool objects), and sub_agents (for multi-agent architectures).

2. Tool Framework

Tools in ADK are plain Python functions with type-annotated signatures and docstrings — the framework auto-generates the JSON schema for the model. Tools can be wrapped from external libraries (LangChain tools, MCP servers). ADK supports three tool types:

  1. Function tools: plain Python callables you define yourself
  2. Built-in tools: Google-provided tools such as Google Search and code execution
  3. Third-party tools: wrappers for external ecosystems (LangChain tools, CrewAI tools, MCP servers)
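To make the "plain annotated function" convention concrete, here is a framework-free illustration of the information the signature and docstring carry, using only the standard library. This mimics the idea of schema generation; it is not ADK's actual implementation:

```python
import inspect
from typing import get_type_hints

def row_count(table: str, exact: bool = False) -> int:
    """Count rows in a table.

    Args:
        table: Fully qualified table name.
        exact: If True, force an exact count instead of an estimate.
    """
    return 0  # placeholder body; only the signature matters here

# A framework can recover parameter names, types, defaults, and the
# description without any separate schema declaration:
hints = get_type_hints(row_count)
sig = inspect.signature(row_count)
schema = {
    "name": row_count.__name__,
    "description": inspect.getdoc(row_count).split("\n")[0],
    "parameters": {
        name: {"type": hints[name].__name__, "required": p.default is p.empty}
        for name, p in sig.parameters.items()
    },
}
print(schema)
```

A parameter with a default (`exact`) comes out as optional, which is why tool signatures should carry accurate annotations and defaults.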

3. Memory Systems

ADK provides pluggable memory services:

| Memory Type | Scope | Default Backend |
| --- | --- | --- |
| Session State | Within a single conversation session | In-memory dict |
| Persistent Memory | Across sessions for a user | Vertex AI Memory Bank / Firestore |
| Shared Agent State | Across agents within a workflow | In-session state dictionary |
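Shared state is usually read and written inside tools. In the sketch below (`record_failure` is a hypothetical tool, and the `ToolContext` import path may vary across ADK versions), ADK injects the `tool_context` argument automatically and hides it from the model-facing schema:

```python
try:
    from google.adk.tools.tool_context import ToolContext
except ImportError:  # lets this sketch run without google-adk installed
    ToolContext = object  # type: ignore

def record_failure(dag_id: str, tool_context: ToolContext) -> dict:
    """Append a failed DAG id to shared session state.

    Args:
        dag_id: The Airflow DAG that failed.
    """
    # tool_context.state is the shared session-state dict; values written
    # here are visible to downstream agents in the same workflow.
    failures = tool_context.state.get("failures", [])
    failures.append(dag_id)
    tool_context.state["failures"] = failures
    return {"recorded": dag_id, "total_failures": len(failures)}
```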

4. Orchestration Workflows

ADK supports declarative multi-agent workflows via sub-agent composition:

  1. SequentialAgent: runs sub-agents in a fixed declared order, passing shared state forward
  2. ParallelAgent: runs sub-agents concurrently and merges their outputs into shared state
  3. LoopAgent: repeats its sub-agents until a sub-agent escalates or max_iterations is reached
  4. Dynamic delegation: an LlmAgent with sub_agents lets the model choose which child to invoke

5. Deployment Runtime

ADK ships a built-in dev web UI (adk web) for local testing and conversation replay. For production, agents deploy to:

  1. Vertex AI Agent Engine: a fully managed runtime with built-in session management that scales to zero
  2. Cloud Run: containerised serving, suited to always-on, low-latency agents
  3. GKE (or any container platform): for teams that need full control over the runtime

6. Evaluation Framework

ADK includes a built-in evaluation module. Test cases are defined as JSON files with input, expected intermediate_steps (tool calls), and expected final_response. Run with adk eval <agent_module> <eval_set.json> to produce pass/fail metrics per test case.
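A test case might look roughly like the following. The field names mirror the description above, but the exact eval-set schema varies between ADK versions, so treat this as a sketch rather than a canonical file:

```json
{
  "input": "How many rows are in the orders table?",
  "intermediate_steps": [
    {"tool": "query_warehouse", "args": {"sql": "SELECT COUNT(*) FROM orders"}}
  ],
  "final_response": "The orders table contains 1,204 rows."
}
```

The runner compares both the tool-call trajectory and the final answer, so a test can fail even when the wording of the response is acceptable but the agent took a wrong tool path.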


Industry Use Cases

1. Autonomous Data Pipeline Orchestration

A SequentialAgent workflow: a Discovery agent queries BigQuery Information Schema for tables modified in the last 24 hours, a Quality agent runs dbt tests on changed models, and a Notification agent posts a summary to Slack. Triggered by Cloud Scheduler — fully serverless DE automation.

2. Multi-Agent Data Research

A root LlmAgent receives a business question ("Which markets underperformed last quarter?"), dynamically delegates to a SQL Agent (query warehouse), a Trends Agent (Google Search for market context), and a Synthesis Agent (write narrative). Combines structured and unstructured data analysis in one workflow.

3. Intelligent Data Catalog Assistant

An agent backed by Vertex AI Search over the company data catalog answers natural-language questions about dataset lineage, ownership, SLAs, and field definitions. Deployed to a corporate Slack bot via Cloud Run, reducing time-to-answer for data consumers.

4. Automated Incident Resolution

On a PagerDuty alert, a LoopAgent: reads the Airflow task failure → looks up the affected table definition → checks recent dbt test history → queries the error table → generates a fix PR via GitHub API → loops until the pipeline run succeeds. Closes low-complexity incidents without human intervention.


Code Examples

1. Simple LLM Agent with Function Tools

from google.adk.agents import LlmAgent
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
import duckdb

# --- Tools (plain annotated functions) ---
def query_warehouse(sql: str) -> dict:
    """Execute a read-only SQL query against the data warehouse.

    Args:
        sql: A valid SELECT statement.

    Returns:
        A dict with 'columns' and 'rows' keys.
    """
    conn = duckdb.connect("warehouse.ddb", read_only=True)
    df = conn.execute(sql).fetchdf()
    return {"columns": list(df.columns), "rows": df.head(20).to_dict("records")}

def list_tables() -> list[str]:
    """Return all table names in the warehouse."""
    conn = duckdb.connect("warehouse.ddb", read_only=True)
    return [r[0] for r in conn.execute("SHOW TABLES").fetchall()]

# --- Agent ---
warehouse_agent = LlmAgent(
    name="warehouse_agent",
    model="gemini-2.0-flash",
    instruction="""You are a data warehouse assistant.
Use the available tools to answer SQL questions accurately.
Always validate your SQL syntax before executing.""",
    tools=[query_warehouse, list_tables],
)

# --- Run ---
session_service = InMemorySessionService()
runner = Runner(
    agent=warehouse_agent,
    app_name="warehouse_assistant",
    session_service=session_service,
)

session = session_service.create_session(app_name="warehouse_assistant", user_id="user_1")

from google.genai.types import Content, Part
response = runner.run(
    user_id="user_1",
    session_id=session.id,
    new_message=Content(role="user", parts=[Part(text="Which tables exist and what is the row count of the orders table?")]),
)
for event in response:
    if event.is_final_response():
        print(event.content.parts[0].text)

2. Multi-Agent Sequential Pipeline

from google.adk.agents import LlmAgent, SequentialAgent

# Sub-agent 1: Fetch pipeline status
status_agent = LlmAgent(
    name="status_agent",
    model="gemini-2.0-flash",
    instruction="Check all Airflow DAG statuses and store failures in state['failures'].",
    tools=[check_airflow_status],  # custom tool
    output_key="failures",
)

# Sub-agent 2: Root cause analysis
rca_agent = LlmAgent(
    name="rca_agent",
    model="gemini-2.0-pro",
    instruction="""\
Given the pipeline failures in state['failures']:
1. Identify the root cause for each failure
2. Check related table schemas using the warehouse tool
3. Provide a structured diagnosis
Store results in state['diagnoses'].""",
    tools=[query_warehouse],
    output_key="diagnoses",
)

# Sub-agent 3: Generate report
report_agent = LlmAgent(
    name="report_agent",
    model="gemini-2.0-flash",
    instruction="Format state['diagnoses'] into a Markdown incident report with severity ratings.",
    output_key="report",
)

# Compose sequential workflow
pipeline_triage = SequentialAgent(
    name="pipeline_triage",
    sub_agents=[status_agent, rca_agent, report_agent],
)

# Run the full workflow
runner = Runner(agent=pipeline_triage, app_name="triage", session_service=session_service)
session = session_service.create_session(app_name="triage", user_id="oncall")
events = runner.run(
    user_id="oncall",
    session_id=session.id,
    new_message=Content(role="user", parts=[Part(text="Run daily pipeline health check")]),
)
for e in events:
    if e.is_final_response():
        print(e.content.parts[0].text)

3. Loop Agent — Retry Until Success

from google.adk.agents import LlmAgent, LoopAgent

def run_dbt_test(model_name: str) -> dict:
    """Run dbt tests for a model and return pass/fail status.

    Args:
        model_name: The dbt model name to test.

    Returns:
        Dict with 'status' ('passed'|'failed') and the raw dbt 'output'.
    """
    import subprocess
    # dbt test has no --output flag; parse stdout (or pass the global
    # --log-format json flag) if individual test results are needed.
    result = subprocess.run(
        ["dbt", "test", "--select", model_name],
        capture_output=True, text=True,
    )
    return {"status": "passed" if result.returncode == 0 else "failed", "output": result.stdout}

from google.adk.tools.tool_context import ToolContext

def exit_loop(tool_context: ToolContext) -> dict:
    """Signal that the loop should stop (all tests pass)."""
    # Escalation is how a sub-agent terminates a LoopAgent early.
    tool_context.actions.escalate = True
    return {"status": "loop_terminated"}

fixer_agent = LlmAgent(
    name="fixer",
    model="gemini-2.0-pro",
    instruction="""\
Run dbt tests for the model in state['model_name'].
If tests fail, analyse the errors and apply a fix via the patch_sql tool.
Call the exit_loop tool once all tests pass.""",
    tools=[run_dbt_test, patch_sql_tool, exit_loop],  # patch_sql_tool: custom tool
)

# LoopAgent has no termination_condition parameter: it stops when
# max_iterations is reached or when a sub-agent escalates.
loop = LoopAgent(
    name="auto_fixer",
    sub_agents=[fixer_agent],
    max_iterations=5,
)

4. Deploy to Vertex AI Agent Engine

# requirements: google-cloud-aiplatform[adk,agent_engines]
import vertexai
from vertexai.preview import reasoning_engines

vertexai.init(project="my-gcp-project", location="us-central1")

# Wrap ADK agent for Agent Engine
app = reasoning_engines.AdkApp(
    agent=warehouse_agent,
    enable_tracing=True,
)

# Deploy (takes ~3 min, creates a managed endpoint)
remote_app = reasoning_engines.ReasoningEngine.create(
    app,
    requirements=[
        "google-adk==1.0.0",
        "duckdb==1.1.0",
    ],
    display_name="Warehouse Assistant",
)

# Query the deployed agent
session = remote_app.create_session(user_id="prod_user")
result = remote_app.stream_query(
    user_id="prod_user",
    session_id=session["id"],
    message="How many orders were placed yesterday?",
)

Comparison Table

| Framework | Primary Backing | Multi-Agent | GCP Integration | Open Source | Best For |
| --- | --- | --- | --- | --- | --- |
| Google ADK | Google / Gemini | Native (Sequential/Parallel/Loop) | Native | Yes | GCP-native agent workflows |
| CrewAI | Independent | Native (crew metaphor) | Via LangChain tools | Yes | Role-based team agents |
| LangChain | LangChain Inc. | Via LangGraph | Via integrations | Yes | Composable pipelines, RAG |
| Azure AI Agents | Microsoft / OpenAI | Via AutoGen | Azure-native | Partial | Azure-native deployments |
| AWS Bedrock Agents | AWS / Anthropic | Multi-agent collaboration | AWS-native | No | AWS-native deployments |

Gotchas & Pitfalls

  1. Tool parameters must be JSON-serialisable. The model sends arguments as JSON, so a datetime or DataFrame parameter fails at call time; stick to str, int, float, bool, list, and dict.
  2. Docstrings are the tool contract. ADK builds the model-facing schema from type annotations and the Args: section, so a missing or vague docstring leads to malformed tool calls.
  3. A LoopAgent needs an explicit exit. Without max_iterations or an escalation signal from a sub-agent, a loop can retry indefinitely and burn tokens.
  4. Scale-to-zero runtimes cold-start. Agent Engine, and Cloud Run without a minimum instance count, add noticeable latency to the first request; set min-instances for latency-sensitive agents.

Exercises

  1. Warehouse Q&A Agent: Build and locally run a LlmAgent with tools for list_tables(), describe_table(table_name), and query_warehouse(sql) backed by a real DuckDB file. Evaluate it against 10 test questions using the ADK eval framework (adk eval).
  2. Parallel Research Agent: Create a ParallelAgent with three sub-agents that simultaneously search for: (a) BigQuery pricing updates, (b) Snowflake pricing updates, (c) Databricks pricing updates. A final LlmAgent synthesises all three into a cost-comparison table.
  3. Deploy to Agent Engine: Take the warehouse Q&A agent, deploy it to Vertex AI Agent Engine using the AdkApp wrapper, and run at least three queries against the remote endpoint. Capture traces in Cloud Trace and identify the slowest tool call.

Quiz

  1. How does ADK convert a Python function into a tool schema for the Gemini model?
    Answer: ADK inspects the function's type annotations and docstring (specifically the Args: section) to automatically generate a JSON Schema compatible with the Gemini function-calling protocol. No separate schema definition is needed.
  2. What is the difference between SequentialAgent and LlmAgent with sub_agents?
    Answer: SequentialAgent runs sub-agents in a fixed declared order. LlmAgent with sub_agents lets the parent LLM dynamically decide which sub-agent to invoke based on the current conversation context — providing flexible delegation at the cost of predictability.
  3. Why must tool function parameters be JSON-serialisable?
    Answer: The Gemini model sends tool arguments as JSON. ADK deserialises them from JSON into Python types. If a tool expects a type not representable in JSON (e.g. a datetime object), the call will fail. Always use primitives (str, int, float, bool, list, dict) in tool signatures.
  4. What is the output_key parameter on an LlmAgent used for in a SequentialAgent pipeline?
    Answer: It specifies the key under which the agent's final text response is stored in the shared session state dict. Downstream agents can then read it from state, enabling structured data flow without needing direct agent-to-agent communication.
  5. What deployment option should you choose for a low-latency, always-on ADK agent vs an infrequent batch agent?
    Answer: For low-latency always-on: Cloud Run with min-instances=1 to avoid cold starts. For infrequent/batch: Vertex AI Agent Engine (scales to zero, lower cost, built-in session management) accepting higher cold start latency.

Further Reading
