Agentic Design Patterns

Reflection

The Reflection pattern involves an agent evaluating its own work, output, or internal state and using that evaluation to improve its performance or refine its response. It's a form of self-correction or self-improvement, allowing the agent to iteratively refine its output or adjust its approach based on feedback, internal critique, or comparison against desired criteria. Reflection can also be facilitated by a separate agent whose specific role is to analyze the output of an initial agent.

The process typically involves:

  1. Execution: The agent performs a task or generates an initial output.
  2. Evaluation/Critique: The agent (often using another LLM call or a set of rules) analyzes the result from the previous step. This evaluation might check for factual accuracy, coherence, style, completeness, adherence to instructions, or other relevant criteria.
  3. Reflection/Refinement: Based on the critique, the agent determines how to improve. This might involve generating a refined output, adjusting parameters for a subsequent step, or even modifying the overall plan.
  4. Iteration (Optional but common): The refined output or adjusted approach can then be executed, and the reflection process can repeat until a satisfactory result is achieved or a stopping condition is met.
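
In code, this loop is little more than a generate-critique-refine cycle with a stopping check. The following minimal sketch assumes three hypothetical helpers (generate, critique, refine) wrapping LLM calls, and an assumed stopping phrase; concrete implementations follow in the Code Examples section.

def reflect(task: str, max_iterations: int = 3) -> str:
    # 1. Execution: produce an initial output for the task.
    output = generate(task)                      # hypothetical LLM call
    for _ in range(max_iterations):
        # 2. Evaluation/Critique: judge the output against the task.
        feedback = critique(task, output)        # hypothetical LLM call
        if feedback.strip() == "OK":             # assumed stopping phrase
            break                                # 4. Iteration: stop when satisfied
        # 3. Reflection/Refinement: improve the output using the feedback.
        output = refine(task, output, feedback)  # hypothetical LLM call
    return output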

A key and highly effective implementation of the Reflection pattern separates the process into two distinct logical roles: a Producer and a Critic. This is often called the "Generator-Critic" or "Producer-Reviewer" model.

  1. The Producer Agent: This agent's primary responsibility is to perform the initial execution of the task. It focuses entirely on generating the content, whether it's writing code, drafting a blog post, or creating a plan. It takes the initial prompt and produces the first version of the output.
  2. The Critic Agent: This agent's sole purpose is to evaluate the output generated by the Producer. It is given a different set of instructions, often a distinct persona (e.g., "You are a senior software engineer," "You are a meticulous fact-checker"). The Critic's instructions guide it to analyze the Producer's work against specific criteria, such as factual accuracy, code quality, stylistic requirements, or completeness. It is designed to find flaws, suggest improvements, and provide structured feedback.
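
In practice, the two roles are often just two differently prompted calls to the same underlying model. A minimal sketch, assuming a hypothetical chat(system, user) helper and illustrative personas:

PRODUCER_SYSTEM = "You are a technical writer. Complete the user's task as well as you can."
CRITIC_SYSTEM = (
    "You are a senior software engineer and a meticulous fact-checker. "
    "Review the submitted work against the original task. "
    "List concrete flaws and improvements; do not rewrite the work yourself."
)

draft = chat(system=PRODUCER_SYSTEM, user=task)                               # hypothetical helper
review = chat(system=CRITIC_SYSTEM, user=f"Task:\n{task}\n\nWork:\n{draft}")  # hypothetical helper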

Practical Applications & Use Cases

  1. Creative Writing and Content Generation
  2. Code Generation and Debugging
  3. Complex Problem Solving
  4. Summarization and Information Synthesis
  5. Planning and Strategy
  6. Conversational Agents

Summary

What: An agent's initial output is often suboptimal, suffering from inaccuracies, incompleteness, or a failure to meet complex requirements. Basic agentic workflows lack a built-in process for the agent to recognize and fix its own errors. This is solved by having the agent evaluate its own work or, more robustly, by introducing a separate logical agent to act as a critic, so that the initial response is never accepted as final without scrutiny.

Why: The Reflection pattern offers a solution by introducing a mechanism for self-correction and refinement. It establishes a feedback loop where a "producer" agent generates an output, and then a "critic" agent (or the producer itself) evaluates it against predefined criteria. This critique is then used to generate an improved version. This iterative process of generation, evaluation, and refinement progressively enhances the quality of the final result, leading to more accurate, coherent, and reliable outcomes.

Rule of Thumb: Use the Reflection pattern when the quality, accuracy, and detail of the final output are more important than speed and cost. It is particularly effective for tasks like generating polished long-form content, writing and debugging code, and creating detailed plans. Employ a separate critic agent when tasks require high objectivity or specialized evaluation that a generalist producer agent might miss.

Key Takeaways

  • The primary advantage of the Reflection pattern is its ability to iteratively self-correct and refine outputs, leading to significantly higher quality, accuracy, and adherence to complex instructions.
  • It involves a feedback loop of execution, evaluation/critique, and refinement. Reflection is essential for tasks requiring high-quality, accurate, or nuanced outputs.
  • A powerful implementation is the Producer-Critic model, where a separate agent (or prompted role) evaluates the initial output. This separation of concerns enhances objectivity and allows for more specialized, structured feedback.
  • However, these benefits come at the cost of increased latency and computational expense, along with a higher risk of exceeding the model's context window or being throttled by API services.
  • While full iterative reflection often requires stateful workflows (like LangGraph), a single reflection step can be implemented in LangChain using LCEL to pass output for critique and subsequent refinement (see the sketch after this list).
  • Google ADK can facilitate reflection through sequential workflows where one agent's output is critiqued by another agent, allowing for subsequent refinement steps.
  • This pattern enables agents to perform self-correction and enhance their performance over time.
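
To illustrate the LCEL point from the list above, here is a minimal single reflection step. The prompts and the model choice are illustrative assumptions; the pattern is simply three chained prompt-to-model pipelines invoked in sequence.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOpenAI(model="gpt-4o")  # illustrative model choice
to_text = StrOutputParser()

# Three LCEL pipelines: generate, critique, refine.
generate = ChatPromptTemplate.from_template("Answer the question:\n{question}") | llm | to_text
critique = ChatPromptTemplate.from_template("Critique this answer for accuracy and clarity:\n{answer}") | llm | to_text
refine = ChatPromptTemplate.from_template(
    "Rewrite the answer to address the critique.\nAnswer:\n{answer}\nCritique:\n{critique}"
) | llm | to_text

def reflect_once(question: str) -> str:
    # One pass of generate -> critique -> refine, with no iteration.
    answer = generate.invoke({"question": question})
    feedback = critique.invoke({"answer": answer})
    return refine.invoke({"answer": answer, "critique": feedback})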

Code Examples

LangChain Implementation
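
This script implements the full loop described above: a single model alternates between a producer role (generating and refining a calculate_factorial implementation) and a critic role (reviewing it as a senior engineer) until the critic returns a stopping phrase or the iteration budget is exhausted.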

reflection_langchain.py

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage
# --- Configuration ---
# Load environment variables from .env file (for OPENAI_API_KEY)
load_dotenv()
# Check if the API key is set
if not os.getenv("OPENAI_API_KEY"):
    raise ValueError("OPENAI_API_KEY not found in .env file. Please add it.")
# Initialize the chat LLM. Defaults to gpt-4o for better reasoning;
# override via MODEL_ID in .env. A lower temperature is used for more
# deterministic outputs.
MODEL_ID = os.getenv("MODEL_ID", "gpt-4o")
llm = ChatOpenAI(model=MODEL_ID, temperature=0.1)
def run_reflection_loop():
    """
    Demonstrates a multi-step AI reflection loop to progressively
    improve a Python function.
    """
    # --- The Core Task ---
    task_prompt = """
Your task is to create a Python function named `calculate_factorial`.
This function should do the following:
1. Accept a single integer `n` as input.
2. Calculate its factorial (n!).
3. Include a clear docstring explaining what the function does.
4. Handle edge cases: The factorial of 0 is 1.
5. Handle invalid input: Raise a ValueError if the input is a negative number.
"""

    # --- The Reflection Loop ---
    max_iterations = 3
    current_code = ""
    # We will build a conversation history to provide context in each step.
    message_history = [HumanMessage(content=task_prompt)]

    for i in range(max_iterations):
        print("\n" + "=" * 25 + f" REFLECTION LOOP: ITERATION {i + 1} " + "=" * 25)

        # --- 1. GENERATE / REFINE STAGE ---
        # In the first iteration we generate from the task prompt alone;
        # in subsequent iterations we refine using the accumulated critiques.
        if i == 0:
            print("\n>>> STAGE 1: GENERATING initial code...")
            # The history contains only the task prompt at this point.
            response = llm.invoke(message_history)
            current_code = response.content
        else:
            print("\n>>> STAGE 1: REFINING code based on previous critique...")
            # The message history now contains the task, the last code,
            # and the last critique. We instruct the model to apply the critiques.
            message_history.append(
                HumanMessage(content="Please refine the code using the critiques provided.")
            )
            response = llm.invoke(message_history)
            current_code = response.content

        print("\n--- Generated Code (v" + str(i + 1) + ") ---\n" + current_code)
        message_history.append(response)  # Add the generated code to history

        # --- 2. REFLECT STAGE ---
        print("\n>>> STAGE 2: REFLECTING on the generated code...")
        # Create a specific prompt for the reflector agent.
        # This asks the model to act as a senior code reviewer.
        reflector_prompt = [
            SystemMessage(
                content="""
You are a senior software engineer and an expert in Python.
Your role is to perform a meticulous code review.
Critically evaluate the provided Python code based on the original task requirements.
Look for bugs, style issues, missing edge cases, and areas for improvement.
If the code is perfect and meets all requirements, respond with the single phrase 'CODE_IS_PERFECT'.
Otherwise, provide a bulleted list of your critiques.
"""
            ),
            HumanMessage(
                content=f"Original Task:\n{task_prompt}\n\nCode to Review:\n{current_code}"
            ),
        ]
        critique_response = llm.invoke(reflector_prompt)
        critique = critique_response.content

        # --- 3. STOPPING CONDITION ---
        if "CODE_IS_PERFECT" in critique:
            print("\n--- Critique ---\nNo further critiques found. The code is satisfactory.")
            break

        print("\n--- Critique ---\n" + critique)
        # Add the critique to the history for the next refinement loop.
        message_history.append(HumanMessage(content=f"Critique of the previous code:\n{critique}"))

    print("\n" + "=" * 30 + " FINAL RESULT " + "=" * 30)
    print("\nFinal refined code after the reflection process:\n")
    print(current_code)


if __name__ == "__main__":
    run_reflection_loop()

Google ADK Implementation
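
The ADK version assigns the roles to dedicated agents: a DraftWriter produces the initial draft, then a LoopAgent repeatedly runs a Critic, a Stopper (which calls an exit_loop tool once the critic approves), and a Refiner until the draft passes review or max_iterations is reached.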

reflection_adk.py

"""
ADK Generator–Critic–Refiner Loop (with terminal condition) using LoopAgent
What you get:
- DraftWriter: produces an initial draft once (stored in state["draft_text"])
- Critic: reviews the draft and either:
- outputs EXACT completion phrase (meaning “done”), OR
- outputs actionable critique (stored in state["critique"])
- Refiner: if done -> calls exit_loop() to terminate the LoopAgent
else -> applies critique and overwrites state["draft_text"]
- LoopAgent: runs [Critic, Refiner] repeatedly up to max_iterations or until exit_loop() escalates.
How to run:
1) pip install python-dotenv google-adk google-genai (package names may vary in your env)
2) .env in same folder:
GOOGLE_API_KEY=...
GOOGLE_MODEL_ID=gemini-2.0-flash
3) python adk_loop_reflection.py
"""
import asyncio
import os
import uuid
from dotenv import load_dotenv
from google.adk.agents import LoopAgent, LlmAgent, SequentialAgent
from google.adk.runners import Runner
from google.adk.sessions.in_memory_session_service import InMemorySessionService
from google.adk.tools.tool_context import ToolContext
from google.genai.types import Content, Part
# ----------------------------
# Configuration
# ----------------------------
load_dotenv()
MODEL_ID = os.getenv("GOOGLE_MODEL_ID", "gemini-2.0-flash")  # default matches the docstring
if not os.getenv("GOOGLE_API_KEY"):
    raise RuntimeError("Missing GOOGLE_API_KEY. Put it in .env or export it.")
APP_NAME = "adk_loop_reflection_demo"
USER_ID = "saint1729"
# State keys
STATE_DRAFT = "draft_text"
STATE_CRITIQUE = "review_output"
# Terminal condition phrase (Critic must output EXACTLY this when draft is good)
COMPLETION_PHRASE = "No major issues found."
# ----------------------------
# Tool: exit loop
# ----------------------------
def exit_loop(tool_context: ToolContext):
    """
    Call this ONLY when the Critic indicates completion.
    Setting `tool_context.actions.escalate = True` tells ADK to stop the loop.
    """
    print(f"[Tool Call] exit_loop() triggered by agent={tool_context.agent_name}")
    tool_context.actions.escalate = True
    tool_context.actions.skip_summarization = True
    return {}  # Tools should return JSON-serializable output
# ----------------------------
# 1) Initial Generator (runs once)
# ----------------------------
draft_writer = LlmAgent(
    name="DraftWriter",
    model=MODEL_ID,
    # Note: this agent must see the user's message (the subject), so we do
    # NOT set include_contents="none" here.
    description="Generates the initial draft content on a given subject.",
    instruction="""
You are a concise technical writer.
Write a short, informative paragraph about the user's subject (3–5 sentences).
Do not use headings or bullet points.
Output ONLY the paragraph text.
""".strip(),
    output_key=STATE_DRAFT,
)
# ----------------------------
# 2a) Critic (runs inside the loop)
# ----------------------------
critic = LlmAgent(
    name="Critic",
    model=MODEL_ID,
    include_contents="none",
    description="Reviews the draft and either approves it or provides actionable critique.",
    instruction=f"""
You are a meticulous reviewer.
Draft to review:
{{{STATE_DRAFT}}}
Completion criteria:
1) The paragraph is coherent and informative for the given subject.
2) No obvious factual errors or overconfident claims.
3) No fluff; clear and concise writing.
Task:
- If ALL criteria are met, respond EXACTLY with: "{COMPLETION_PHRASE}"
- Otherwise, respond with a concise bullet list of specific improvements.
Output ONLY your critique text (or the exact completion phrase).
""".strip(),
    output_key=STATE_CRITIQUE,
)
# ----------------------------
# 2b) Stopping tool agent (runs inside the loop, checks for completion)
# ----------------------------
stopper = LlmAgent(
    name="Stopper",
    model=MODEL_ID,
    include_contents="none",
    tools=[exit_loop],
    description="Stops the loop when the critic approves.",
    instruction=f"""
Critique:
{{{STATE_CRITIQUE}}}
Rule:
- If the critique is EXACTLY "{COMPLETION_PHRASE}", call `exit_loop()`.
- Otherwise, output ONLY the single word: CONTINUE.
""".strip(),
    # IMPORTANT: no output_key here (so it can't overwrite draft_text)
)
# ----------------------------
# 2c) Refiner (runs inside the loop)
# ----------------------------
refiner = LlmAgent(
    name="Refiner",
    model=MODEL_ID,
    include_contents="none",
    description="Refines the draft based on the critique.",
    # No exit_loop tool here: the Stopper owns loop termination.
    instruction=f"""
You refine drafts.
Current draft:
{{{STATE_DRAFT}}}
Critique:
{{{STATE_CRITIQUE}}}
Apply the critique and output ONLY the refined paragraph (3–5 sentences).
""".strip(),
    # Overwrite the draft in state on each refinement iteration
    output_key=STATE_DRAFT,
)
# ----------------------------
# 3) LoopAgent: Critic -> Refiner until exit_loop or max_iterations
# ----------------------------
reflection_loop = LoopAgent(
    name="DraftReviewLoop",
    sub_agents=[critic, stopper, refiner],  # order matters: critique first, then stop-check, then refine
    max_iterations=5,
)
# ----------------------------
# 4) Full pipeline: initial draft once, then loop until done
# ----------------------------
root_agent = SequentialAgent(
    name="WriteReviewRefinePipeline",
    sub_agents=[draft_writer, reflection_loop],
    description="Writes an initial draft, then iteratively critiques/refines until acceptable.",
)
# ----------------------------
# Runnable entrypoint
# ----------------------------
async def run_once(subject: str):
    session_service = InMemorySessionService()
    runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)
    session_id = f"session_{uuid.uuid4().hex[:8]}"
    await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_id)

    # The user's message becomes the subject the DraftWriter writes about.
    new_message = Content(role="user", parts=[Part(text=subject)])
    async for event in runner.run_async(
        user_id=USER_ID,
        session_id=session_id,
        new_message=new_message,
    ):
        # Uncomment to see the event stream:
        # print(event)
        pass

    # Inspect final state
    session = await session_service.get_session(app_name=APP_NAME, user_id=USER_ID, session_id=session_id)
    state = session.state or {}
    return {
        "final_draft": state.get(STATE_DRAFT, ""),
        "last_critique": state.get(STATE_CRITIQUE, ""),
    }


def main():
    subject = "The James Webb Space Telescope (JWST)"
    result = asyncio.run(run_once(subject))
    print("\n========== FINAL DRAFT ==========\n")
    print(result["final_draft"])
    print("\n========== LAST CRITIQUE ==========\n")
    print(result["last_critique"])


if __name__ == "__main__":
    main()