the philosophy of agent frameworks

Every agent framework is just a disagreement about how much to trust the model. Here's how to actually think about the landscape.

TL;DR: An agent is an LLM in a loop with tools. Every framework disagrees about how much of that loop to hard-code vs. let the model decide. The landscape went from chains (2022) to stateful graphs (2024) to model-driven harnesses (2025). MCP won as the integration standard. Pick your framework based on how much you trust the model and how much you need to debug.

LangChain, LangGraph, n8n, Agno, and the broader agent landscape in 2026.

table of contents

  1. what "agent" actually means
  2. how we got here
  3. the core mental models
  4. the frameworks, one by one
  5. the protocol layer
  6. where things are right now
  7. how to actually choose

what "agent" actually means in this conversation

Before any framework makes sense, you have to settle the word. An agent is an LLM running in a loop with access to tools, deciding for itself what to do next. The minimum viable agent is the ReAct pattern from 2022: model produces a thought, picks a tool, sees the tool result, picks the next tool, and so on until it decides it's done. That is the whole idea. Everything else, every framework, every architecture diagram, every "multi-agent system," is disagreement about how much of that loop you should hard-code versus how much you should let the model decide.
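To make the loop concrete, here is a minimal sketch in plain Python. The model is replaced by a scripted stand-in (`scripted_model`) and the single `add` tool is made up for illustration; a real agent would swap in an LLM call that returns the same kind of step.

```python
from typing import Callable

# Hypothetical tool registry: plain functions keyed by name.
TOOLS: dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split("+"))),
}

def scripted_model(history: list[str]) -> dict:
    """Stand-in for an LLM call: use a tool first, then finish."""
    observations = [line for line in history if line.startswith("observation: ")]
    if not observations:
        return {"thought": "I should compute this.", "tool": "add", "input": "2+3"}
    return {"thought": "I have the result.",
            "final": observations[-1].removeprefix("observation: ")}

def run_agent(question: str, model=scripted_model, max_steps: int = 5) -> str:
    history = [f"question: {question}"]
    for _ in range(max_steps):              # hard cap so the loop cannot run forever
        step = model(history)
        history.append(f"thought: {step['thought']}")
        if "final" in step:                 # the model decides it is done
            return step["final"]
        result = TOOLS[step["tool"]](step["input"])
        history.append(f"observation: {result}")
    return "gave up: step limit reached"

answer = run_agent("What is 2 + 3?")
```

Everything a framework adds sits around this loop: who writes `history`, who is allowed to pick the next step, and what happens when `max_steps` runs out.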

That single axis, how much control the developer keeps versus how much the developer cedes to the model, is the most useful lens for understanding the space. On one end, you have rigid pipelines where the developer writes every step and the LLM is just a smart string transformer at each node. On the other end, you have a single big "do the thing" call where the model is trusted to plan, decompose, retry, and finish on its own. Every framework you've heard of sits somewhere on that axis, and most have moved along it as models got better. When a smarter model can replace the framework, the team behind it has to keep redefining what value they add.

[Figure: Control spectrum of agent architectures. Five categories arranged from developer-driven flow (left) to model-driven flow (right): Chains (sequential), Graphs (with cycles), Teams (role-based), Dialogue (multi-agent), Harness (LLM-driven).]

how we got here

The pre-history is short and worth knowing. Before late 2022, "agents" mostly meant reinforcement learning agents in research labs, which is a different field. The current usage of the word starts with the ReAct paper (Yao et al., late 2022) and explodes in early 2023 with AutoGPT and BabyAGI, neither of which was good but both of which planted the meme that you could let GPT-4 loop on itself with tools and have it do real things. Most of those early systems failed the same way: they would get into loops, lose the plot, or burn through tokens chasing a hallucinated subgoal.

LangChain shows up in this period as the first widely used Python library that gave you primitives for chaining LLM calls together, plugging in tools, doing retrieval, and so on. It became the dominant library because it was the easiest way to go from "I have an OpenAI key" to "I have a working RAG system." Senior engineers criticized it for hiding prompts behind layers of abstraction. By mid-2023 many had concluded that calling the API directly was simpler than wrestling with LangChain's wrappers. The team rewrote it in 2025 to be leaner.

The second wave, late 2023 through 2024, was the realization that real agents need state, not just chains. A chain is acyclic: input goes in, output comes out, no loops, no memory between runs. A real agent needs to be able to revisit a step, retry on failure, wait for a human, persist conversation history, and maintain typed state across many tool calls. Chains can't do this cleanly. LangGraph was the LangChain team's answer. CrewAI took a different bet: instead of explicit state graphs, model agents as specialists in a team. AutoGen, from Microsoft Research, took a third bet: model the system as a multi-turn conversation between agents.

The third wave, 2025 into 2026, is the one we're in now. Once Claude 3.5/4 and GPT-4o/5 became reliable enough at tool use that you could give them tools and let them loop, Anthropic, OpenAI, AWS, and LangChain each shipped "agent harness" frameworks that don't try to control the loop at all: Claude Agent SDK, OpenAI Agents SDK (replacing Swarm), Strands Agents, and deepagents. All four bet that the model is now good enough that you should hand it the keys and focus on what tools and memory it has access to. At the same time, the protocol layer matured: MCP, introduced by Anthropic in November 2024, became the closest thing the field has to a universal standard.

[Figure: Three generations of agent frameworks. Origins (2022-2023): ReAct paper, AutoGPT, BabyAGI, LangChain v0. Stateful agents (2024): LangGraph, CrewAI, AutoGen, n8n adds AI, MCP launched. Agent harnesses (2025-2026): OpenAI Agents SDK, Claude Agent SDK, Strands, Google ADK, Agno, deepagents.]

the core mental models

A handful of architectural metaphors cover the entire space.

The first is the chain or pipeline. You wire up a directed acyclic graph of LLM calls and other operations, and data flows through. The developer specifies the structure, and the LLM is just the smart bit at each step. This is what early LangChain was. It works for fixed pipelines but doesn't qualify as agentic.
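A chain in this sense is little more than function composition. The sketch below assumes three made-up steps (`retrieve`, `build_prompt`, and a `fake_llm` stand-in for a model call) to show that the order of operations is fixed by the developer, never by the model.

```python
from functools import reduce
from typing import Callable

Step = Callable[[str], str]

def retrieve(question: str) -> str:
    # hypothetical retrieval step
    return f"context for: {question}"

def build_prompt(context: str) -> str:
    return f"answer using [{context}]"

def fake_llm(prompt: str) -> str:
    # stand-in for a model call; it just upper-cases the prompt
    return prompt.upper()

def chain(*steps: Step) -> Step:
    """Compose steps left to right. There are no branches and no loops."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)

pipeline = chain(retrieve, build_prompt, fake_llm)
output = pipeline("what is mcp?")
```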

The second is the stateful graph or state machine. Same idea as a chain, but you allow cycles, conditional edges, and explicit shared state that flows through the graph and gets updated at each node. Now the LLM can decide, at certain nodes, which edge to take next. The developer still defines the topology, but the LLM steers within it. LangGraph is the example. This is the dominant mental model for production systems where you need to know exactly what your agent did and why.

The third is the role-based team. You define agents as specialists with roles, goals, backstories, and tools, and you give them tasks. The framework handles delegation, communication, and result aggregation. You don't think about graphs; you think about a marketing team or a research crew. CrewAI is the example. This makes the common case fast to prototype and the uncommon case painful, because the abstraction is rigid.

The fourth is the multi-agent conversation. Multiple agents, each with their own system prompt and tools, interact through a shared dialogue. An orchestrator decides who speaks next. AutoGen (and its community fork AG2) is built on this. It's good for problems where you want emergent behavior from agents debating, refining each other's outputs, or specializing through dialogue, and most agentic research happens here.

The fifth is the agent harness or tool-calling loop. A harness here means a thin wrapper around the model: it handles the plumbing (streaming, retries, tool dispatch) but makes zero decisions about what the agent does next. There is no graph, no team, no conversation: just one LLM in a loop, with a curated set of tools and a long-context memory, trusted to drive. The framework makes that loop production-grade: streaming, persistence, observability, sub-agents on demand, file-system-as-memory, planning guides. Claude Agent SDK, OpenAI Agents SDK, deepagents, and Strands all live here. This is the bet on capability: as models improve, less scaffolding is needed.

The sixth is the visual workflow builder. You drag nodes onto a canvas, connect them with arrows, and the canvas itself is the program. AI agents are just one type of node alongside HTTP calls, database queries, conditionals, and so on. n8n, Make, Zapier, and similar tools live here. The philosophy is that most real-world automations are mostly deterministic with AI sprinkled in, and a visual interface meets that reality better than code.

These are not mutually exclusive. LangGraph can run inside an n8n node. An OpenAI Agents SDK agent can be a sub-agent in a CrewAI crew.


one task, three frameworks

The mental models above are easier to feel when you see the same task implemented three ways. Take a simple agent that answers questions by searching the web and then synthesizing a response. Here it is in LangGraph, Agno, and the Claude Agent SDK.

LangGraph: you draw the graph

from typing import TypedDict

from langchain_anthropic import ChatAnthropic
from langgraph.graph import StateGraph, END

class State(TypedDict):
    question: str
    search_results: str
    answer: str

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

def web_search(query: str) -> str:
    # hypothetical placeholder: swap in your real search tool
    return f"(search results for: {query})"

def needs_search(question: str) -> bool:
    # hypothetical placeholder: always search; replace with your own rule
    return True

def search(state: State) -> dict:
    # call the search tool and return a partial state update
    results = web_search(state["question"])
    return {"search_results": results}

def synthesize(state: State) -> dict:
    # search_results is absent when the search node was skipped
    context = state.get("search_results", "")
    prompt = f"Answer based on: {context}\nQ: {state['question']}"
    answer = llm.invoke(prompt).content
    return {"answer": answer}

def should_search(state: State) -> str:
    # developer-written routing: returns the name of the first node to run
    return "search" if needs_search(state["question"]) else "synthesize"

graph = StateGraph(State)
graph.add_node("search", search)
graph.add_node("synthesize", synthesize)
graph.set_conditional_entry_point(should_search)
graph.add_edge("search", "synthesize")
graph.add_edge("synthesize", END)
agent = graph.compile()

result = agent.invoke({"question": "What is MCP?"})

You defined the topology. You chose the edges. The LLM fills in the blanks at each node, but it never decides which node to visit next - that's your conditional function. If something goes wrong, you look at the state at each transition and see exactly where it derailed.

Agno: you hand it tools

from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.tools.duckduckgo import DuckDuckGoTools
 
agent = Agent(
    model=Claude(id="claude-sonnet-4-20250514"),
    tools=[DuckDuckGoTools()],
    instructions="Answer questions. Search the web if you need current info.",
    markdown=True,
)
 
agent.print_response("What is MCP?")

The agent itself is six lines. No graph, no state class, no edges. The model decides whether to search. Agno wraps the tool-calling loop and handles the back-and-forth internally. You trade visibility for speed of development.

Claude Agent SDK: you trust the model

import asyncio

from claude_agent_sdk import query, ClaudeAgentOptions

options = ClaudeAgentOptions(
    model="claude-sonnet-4-20250514",
    system_prompt="Answer questions. Search if needed. Think step by step.",
    allowed_tools=["WebSearch"],  # built-in web search tool
)

async def main():
    # query() runs the tool-use loop and streams messages as they arrive:
    # model calls search -> gets results -> synthesizes answer
    async for message in query(prompt="What is MCP?", options=options):
        print(message)

asyncio.run(main())

The model does everything. You provide tools and a system prompt. The SDK runs the tool-calling loop until the model decides it's done. This is the harness philosophy: the model is the control flow.

The same task, three levels of developer control. LangGraph: you hard-code the flow. Agno: you configure the agent and let it loop. Claude SDK: you hand over the keys. Each is the right choice in different contexts. The question is always how much you trust the model and how much you need to debug.

the frameworks, one by one

LangChain

LangChain started as a chain library, became an ecosystem, and is currently rebranding as "the agent engineering platform." The original 2023 LangChain was a giant collection of integrations (every vector store, every LLM, every tool) wrapped in a set of abstractions for chaining LLM calls. It was, in retrospect, too opinionated. The team rewrote it in 2025 to be more streamlined, and the modern langchain package is a much leaner integration layer on top of langgraph's runtime.

The mistake people make about LangChain is treating it as a single thing. Think of it as three layers stacked. At the bottom is langchain-core, which gives you common abstractions for messages, models, tools, and runnables. In the middle is langchain itself, the integration layer with hundreds of model providers, vector stores, and tools. On top sit langgraph for actual agent control flow, langsmith for observability, and langserve for deployment. When people say "LangChain is bloated," they usually mean the integration layer. When people say "LangChain is the most powerful framework," they usually mean the whole stack including LangGraph and LangSmith. Both are true.

The current value proposition is breadth. If you need to talk to 100 different models, 50 different vector stores, and 200 different APIs, nobody else has anything close. The cost is that every abstraction you adopt is one more layer between you and the actual prompt being sent. For a senior engineer, that's often a worse tradeoff than just calling the model API directly and writing your own thin wrapper.


LangGraph

LangGraph is the part of the LangChain world that most production teams care about. The model is simple and worth understanding precisely: your agent is a directed graph; nodes are functions (which can call LLMs, tools, or anything else); edges are control flow (which can be conditional based on state); and a single typed State object flows through the graph and is updated by nodes. Cycles are allowed, so an agent can loop. Checkpointing is built in, so every state transition is persisted, which gives you time-travel debugging, human-in-the-loop pauses, and crash recovery.
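The model described here (nodes as functions, conditional edges, one state object, a checkpoint per transition) fits in a few lines of plain Python. This is a framework-free sketch of the idea, not LangGraph's API: `work`, `route`, and `run` are names invented for the example.

```python
import copy
from typing import Callable, TypedDict

class State(TypedDict):
    attempts: int
    done: bool

def work(state: State) -> dict:
    attempts = state["attempts"] + 1
    return {"attempts": attempts, "done": attempts >= 3}    # "succeeds" on the third try

def route(state: State) -> str:
    return "END" if state["done"] else "work"               # conditional edge: loop or stop

NODES: dict[str, Callable[[State], dict]] = {"work": work}

def run(state: State, start: str = "work"):
    checkpoints = []
    node = start
    while node != "END":
        state = {**state, **NODES[node](state)}             # merge the node's partial update
        checkpoints.append((node, copy.deepcopy(state)))    # persist every transition
        node = route(state)
    return state, checkpoints

final, checkpoints = run({"attempts": 0, "done": False})

# "Time travel": resume from the state saved after the first transition.
resumed, _ = run(checkpoints[0][1])
```

Because every transition is saved, debugging is a matter of reading `checkpoints`, and crash recovery is a matter of calling `run` again from the last saved state.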

LangGraph won as the production default because it makes the control flow explicit. When something goes wrong in production, you can look at the graph, look at the state at each transition, and reason about what the agent did. With a "trust the model" harness, you have to read traces and try to figure out why the model made a given decision, which is harder. LangGraph's bet is that for high-stakes systems, the cost of explicitness is lower than the cost of opacity.

The downside is that explicitness. You write more code than you would in CrewAI or Agno for the same prototype. The learning curve is real. And as models improve, some of the structure you encoded in your graph becomes structure the model could have figured out on its own, meaning you've over-engineered. The LangChain team's answer is deepagents, which is a higher-level abstraction built on top of langgraph's runtime that gives you the "harness" feel without giving up the underlying durability and statefulness. It's LangChain's equivalent to Claude Agent SDK.

If you only learn one framework for production work, LangGraph is the choice today. It is model-agnostic and open source.


n8n

n8n is the odd one in the list, because it is not an agent framework. It is a general-purpose workflow automation platform, like Zapier or Make, that has added AI agent nodes. The philosophy is the inverse of code-first frameworks: most real-world automations are deterministic plumbing with AI in a few key places, and a visual builder is the right abstraction for that reality.

The architecture matters. n8n is a TypeScript application, and its AI agent nodes are built on LangChain.js underneath. So when you drop an "AI Agent" node onto an n8n canvas and wire up a chat model, a memory, and some tools, you are configuring a LangChain agent through a GUI. This is why n8n's own internal AI Workflow Builder, the feature that generates workflows from natural language prompts, is itself a LangGraph multi-agent system under the hood. The visual layer sits on top of code-first frameworks; it doesn't replace them.

The right way to think about n8n is as a hybrid runtime. You get deterministic nodes (HTTP calls, database queries, scheduling, branching, error handling) for the boring parts of any automation, and you get AI agent nodes for the parts that need a model. The model isn't doing everything; it's doing the part that has to be smart, while the deterministic graph around it handles auth, retries, error paths, and integration.
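In code terms, the hybrid looks roughly like the sketch below: deterministic steps with retries wrapped around one step that needs a model. The ticket data, the flaky `fetch_ticket`, and the keyword-based `classify_with_model` stand-in are all invented for illustration; n8n expresses the same shape on a canvas.

```python
def with_retries(fn, attempts=3):
    """Deterministic plumbing: retry a step on transient failure."""
    def wrapped(value):
        last_error = None
        for _ in range(attempts):
            try:
                return fn(value)
            except RuntimeError as err:
                last_error = err
        raise last_error
    return wrapped

calls = {"fetch": 0}

def fetch_ticket(ticket_id):
    # hypothetical HTTP node that fails once before succeeding
    calls["fetch"] += 1
    if calls["fetch"] < 2:
        raise RuntimeError("transient network error")
    return {"id": ticket_id, "body": "My invoice is wrong"}

def classify_with_model(ticket):
    # the one "AI node"; a keyword check stands in for the model call
    label = "billing" if "invoice" in ticket["body"].lower() else "other"
    return {**ticket, "label": label}

def route_to_queue(ticket):
    # deterministic branching on the model's output
    return f"queue:{ticket['label']}"

workflow = [with_retries(fetch_ticket), classify_with_model, route_to_queue]

result = 42  # the ticket id entering the workflow
for node in workflow:
    result = node(result)
```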

It also means n8n's sweet spot is the workflows where you need both reliability and AI: the right move is often to keep the heavy reasoning inside a LangGraph or Agno agent and call it from n8n as a single node.


Agno

Agno (formerly Phidata, rebranded in early 2025) is a reaction against what its authors saw as overengineering in LangChain and LangGraph. Its pitch is "pure Python, no graphs, no chains, just agents that work and run fast."

The Agent class encapsulates the entire reasoning loop in a single object. You construct it with a model, tools, memory, knowledge sources, and storage; you call it; it runs the loop and returns a result. There is no graph to define, no edges to wire. For multi-agent systems, Agno gives you Teams (with coordination modes such as route, coordinate, and collaborate) and Workflows for sequenced execution. Each team member can itself be an agent or a sub-team, so you get composition for free.

The technical bet that distinguishes it is performance and statelessness. Agno claims agent instantiation in around 3 microseconds and a tiny memory footprint, achieved by treating agents as lightweight, stateless, session-scoped objects rather than long-lived stateful processes. This is the right design for horizontal scaling: spin up an agent per request, do the work, throw it away. State lives in the storage layer (Postgres, SQLite, Mongo, vector stores) rather than in the agent. This is a different model from LangGraph, where agents are graph runtimes that maintain state internally and rely on checkpointing for durability.
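The stateless pattern is easy to sketch without any framework. Below, a made-up `RequestAgent` is created fresh for every request and keeps nothing between calls; the conversation lives in SQLite. This illustrates the design, not Agno's actual classes.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (session_id TEXT, role TEXT, content TEXT)")

class RequestAgent:
    """Hypothetical per-request agent: it holds no history of its own."""

    def __init__(self, storage, session_id):
        self.storage = storage
        self.session_id = session_id

    def history(self):
        rows = self.storage.execute(
            "SELECT role, content FROM messages WHERE session_id = ? ORDER BY rowid",
            (self.session_id,),
        )
        return rows.fetchall()

    def run(self, user_message):
        turn = len(self.history()) // 2 + 1
        reply = f"reply {turn} to: {user_message}"    # stand-in for a model call
        self.storage.executemany(
            "INSERT INTO messages VALUES (?, ?, ?)",
            [(self.session_id, "user", user_message),
             (self.session_id, "assistant", reply)],
        )
        return reply

# Two separate agent objects, one shared session in storage.
first = RequestAgent(db, "s1").run("hello")
second = RequestAgent(db, "s1").run("and again")
```

The second agent picks up where the first left off even though the two objects share nothing in memory, which is what makes the spin-up-and-discard approach workable.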

AgentOS, the runtime layer, exposes agents, teams, and workflows as REST endpoints with built-in OpenAPI documentation, session management, streaming, and observability hooks. You write an Agent in Python; AgentOS gives you a deployable service. There's also an Agno-Go port that brings the same design to Go for teams who need real concurrency.

Where Agno makes sense over LangGraph is when you want the developer experience of pure Python, you don't need the visual graph abstraction, you care about cold-start latency and per-agent cost, and the integration ecosystem you need is already covered (it has 100+ integrations and supports MCP). Where LangGraph still wins is when your control flow is genuinely complex and you want it visible, when you need the durability guarantees of explicit checkpointing, or when you're already in the LangChain ecosystem and the migration cost is high.


CrewAI

CrewAI's single big idea is that multi-agent problems map onto teams of human specialists, so just code that metaphor directly. You define agents as specialists with a role ("Senior Researcher") and a set of tools. You define tasks. You assemble them into a Crew. You run it. The framework handles delegation, agent-to-agent context passing, and final aggregation.
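Stripped of the framework, the metaphor reduces to something like the following sketch. `Specialist`, `Task`, and `run_crew` are illustrative names rather than CrewAI's API, and the lambdas stand in for model-backed agents.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Specialist:
    role: str
    work: Callable[[str, str], str]   # (task description, context so far) -> output

@dataclass
class Task:
    description: str
    assignee: Specialist

def run_crew(tasks: list[Task]) -> str:
    """Run tasks in order, passing each output on as context for the next."""
    context = ""
    for task in tasks:
        context = task.assignee.work(task.description, context)
    return context

researcher = Specialist("Senior Researcher", lambda task, ctx: f"notes on {task}")
writer = Specialist("Writer", lambda task, ctx: f"{task} based on [{ctx}]")

report = run_crew([Task("MCP adoption", researcher), Task("summary", writer)])
```

The rigidity mentioned earlier is visible here too: anything that is not "do tasks in order and pass the result along" has to fight the structure.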

This is fast for the common case and frustrating for the uncommon case. If your problem fits the metaphor (research, writing, analysis pipelines, content generation, anything you might describe as "a small team of people each doing their part"), you're in production in an afternoon. If your problem doesn't fit, the framework gets in your way. CrewAI's recent versions have added MCP support.

It's the right pick for fast prototyping of business agents and for teams who want to ship without learning graph theory.


AutoGen and AG2

AutoGen came out of Microsoft Research and was always more research-oriented than production-oriented. The original v0.2 introduced the idea of agents talking to each other in a multi-turn conversation, with agents debating and refining each other's outputs. The v0.4 rewrite rearchitected the system to be event-driven and async-first, with pluggable orchestration strategies. AG2 is a separate community fork, started by some of AutoGen's original creators, that continues the v0.2 line. The key abstraction is GroupChat: multiple agents in a shared conversation, with a selector function that decides who speaks next.

AutoGen's bet is that emergent intelligence comes from agent dialogue. Agents specialize through their roles, the conversation history is the shared state, and the selector is the control flow. This is a different bet from CrewAI's task-based delegation: in AutoGen, the conversation is the orchestration. AutoGen Studio gives you a low-code interface on top, but most serious AutoGen users write code.
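A toy version of "the conversation is the orchestration" might look like this. The two agents are scripted functions and the selector simply alternates speakers; none of these names come from AutoGen itself.

```python
def coder(transcript):
    # scripted stand-in: each turn produces the next draft
    drafts = sum(1 for speaker, _ in transcript if speaker == "coder")
    return f"draft v{drafts + 1}"

def critic(transcript):
    # scripted stand-in: approves only the second draft
    last_message = transcript[-1][1]
    return "APPROVED" if last_message == "draft v2" else f"please revise {last_message}"

AGENTS = {"coder": coder, "critic": critic}

def select_speaker(transcript):
    """The selector is the control flow: here, strict alternation."""
    if not transcript:
        return "coder"
    return "critic" if transcript[-1][0] == "coder" else "coder"

def group_chat(max_turns=10):
    transcript = []                      # the shared state is the conversation itself
    for _ in range(max_turns):
        speaker = select_speaker(transcript)
        message = AGENTS[speaker](transcript)
        transcript.append((speaker, message))
        if message == "APPROVED":
            break
    return transcript

transcript = group_chat()
```

Swapping `select_speaker` for a model call that reads the transcript and names the next speaker gives the more flexible, and less predictable, version of the same pattern.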

It remains the framework of choice for research-style multi-agent experiments, code-generation systems where you want a coder agent and a critic agent going back and forth, and anywhere you want maximum flexibility for orchestration patterns. In production it requires more DIY infrastructure than LangGraph or Agno.


OpenAI Agents SDK

OpenAI's Agents SDK launched in March 2025 as a production-grade replacement for their experimental Swarm framework. The key abstraction is the handoff: agents transfer control to each other explicitly, carrying conversation context through the transition. Each agent declares its instructions, model, tools, and the list of agents it can hand off to. The runtime handles the routing.
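The handoff idea can be sketched in plain Python as follows. This is an illustration of the concept under invented names (`Agent`, `decide`, `run`), not the Agents SDK's real interface; the scripted `decide` functions stand in for model calls.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    allowed_handoffs: tuple[str, ...]
    # decide returns ("handoff", agent_name) or ("answer", text)
    decide: Callable[[list[str]], tuple[str, str]]

def triage_decide(messages):
    if "refund" in messages[-1]:
        return ("handoff", "billing")
    return ("answer", "How can I help?")

def billing_decide(messages):
    return ("answer", f"Refund started for: {messages[0]}")

AGENTS = {
    "triage": Agent("triage", ("billing",), triage_decide),
    "billing": Agent("billing", (), billing_decide),
}

def run(start, user_message, max_hops=5):
    messages = [user_message]            # conversation context travels with the handoff
    current = AGENTS[start]
    for _ in range(max_hops):
        kind, value = current.decide(messages)
        if kind == "answer":
            return current.name, value
        if value not in current.allowed_handoffs:
            raise ValueError(f"{current.name} may not hand off to {value}")
        messages.append(f"[handoff {current.name} -> {value}]")
        current = AGENTS[value]
    raise RuntimeError("too many handoffs")

agent_name, reply = run("triage", "I want a refund for order 17")
```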

This is a minimalist framework. Compared to LangGraph's typed-state graphs or CrewAI's role-based crews, the Agents SDK gives you few primitives: agents, handoffs, tools, guardrails. The bet is that good models turn simple primitives into rich behavior, and the framework should stay out of the way. It's tightly integrated with OpenAI's models (though it supports other providers), and it ships with good tracing and observability out of the box.

It's a strong choice if you're already on OpenAI infrastructure. It's not a strong choice if you need cross-provider portability.


Claude Agent SDK and deepagents

Claude Agent SDK (Anthropic, 2025) and LangChain's deepagents are the clearest examples of the agent-harness philosophy. The frame is: don't try to encode the control flow at all. Instead, give the agent a strong system prompt, a curated set of tools, a file system as scratch memory, and a planning structure. Then run the LLM in a loop and let it drive. Sub-agents are spawned on demand by the main agent, not pre-wired by the developer.

This works now because Claude 4 and GPT-5 are far better at long-horizon planning, tool selection, and self-correction than their predecessors. The framework's job shifted from "prop the model into reliability" to "remove obstacles so the model can be reliable on its own." File-system-as-memory, in particular, is a clever trick: instead of stuffing everything into the context window, the agent reads and writes files, which lets it work over arbitrarily long horizons without context pressure.
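File-system-as-memory is simple enough to show directly. In this sketch the tool names (`write_file`, `read_file`, `list_files`) and the temporary workspace are assumptions made for the example; the point is that a large result goes to disk and only a short receipt enters the context.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch workspace for one agent run.
workspace = Path(tempfile.mkdtemp(prefix="agent_ws_"))

def write_file(name: str, text: str) -> str:
    (workspace / name).write_text(text, encoding="utf-8")
    return f"wrote {len(text)} chars to {name}"      # short receipt for the context

def read_file(name: str) -> str:
    return (workspace / name).read_text(encoding="utf-8")

def list_files() -> list[str]:
    return sorted(path.name for path in workspace.iterdir())

# A large tool result is stored on disk instead of in the prompt.
big_result = "finding\n" * 10_000
receipt = write_file("search_results.txt", big_result)
context = [receipt]

# Later, the agent reads back only the part it needs.
first_line = read_file("search_results.txt").splitlines()[0]
```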

This is the most "bullish on models" position in the field, and it's the right philosophy for agent tasks where you don't know what control flow is needed in advance: open-ended research, code generation, complex debugging. It's the wrong philosophy when you do know the control flow, because hard-coding it is cheaper, more debuggable, and more reliable.


the rest of the landscape

These frameworks matter but don't need the long-form treatment. Each occupies a clear niche:

Framework       | Backer        | Mental model       | Sweet spot
Strands         | AWS           | Harness            | Enterprise agents inside IAM/Bedrock/CloudWatch fabric
Google ADK      | Google        | Hierarchical tree  | Gemini-native agents with A2A protocol and multimodal input
Pydantic AI     | Pydantic team | Harness            | Type-strict Python codebases with dependency injection
smolagents      | HuggingFace   | Code-as-action     | Agents that write and execute Python instead of JSON tool calls
LlamaIndex      | LlamaIndex    | RAG + agents       | Retrieval-heavy problems where getting the right context matters most
Semantic Kernel | Microsoft     | Enterprise harness | .NET ecosystem, Azure-integrated deployments

Two things stand out from this table. Smolagents' "agents think in code" approach is significant: instead of producing JSON tool calls, the agent generates Python that runs in a sandbox. This compresses both input tokens (no tool schemas) and intermediate state (results flow through variables, not context). Anthropic has been writing about the same idea under "code execution with MCP." And LlamaIndex's retrieval primitives remain best-in-class if your problem is fundamentally about getting the right information into the model.


the protocol layer underneath all of this

The most important development, more important than any individual framework, is the emergence of protocols that sit beneath the frameworks. The two that matter are MCP and A2A.

[Figure: The agent stack, in three layers. Frameworks (control flow, orchestration): LangGraph, Agno, n8n, Agent SDKs. Protocols (tool and agent integration): MCP, A2A. Models (foundation LLMs): Claude, GPT, Gemini, open weight.]

MCP, the Model Context Protocol, was introduced by Anthropic in November 2024 to solve what they called the N x M integration problem: every model times every tool times every data source equals a custom integration to write. MCP defines a standard JSON-RPC protocol between an MCP client (your AI application) and an MCP server (a tool or data source), so a tool implemented once works with any MCP-compatible client. The spec defines three server-side primitives (tools, resources, prompts), plus client-side features such as sampling and roots.
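To give a feel for what travels between client and server, here is a toy in-process "server" that answers the two tool-related methods, `tools/list` and `tools/call`. The `get_weather` tool and its canned reply are invented, and the transport and initialization handshake are left out, so treat this as a sketch of the message shapes rather than a working MCP server.

```python
import json

# Hypothetical tool definition, described with a JSON Schema for its input.
TOOLS = [{
    "name": "get_weather",
    "description": "Return a canned weather report for a city",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def handle(request: dict) -> dict:
    """Answer a JSON-RPC 2.0 request for the two tool methods."""
    base = {"jsonrpc": "2.0", "id": request["id"]}
    if request["method"] == "tools/list":
        return {**base, "result": {"tools": TOOLS}}
    if request["method"] == "tools/call":
        city = request["params"]["arguments"]["city"]
        text = f"Sunny in {city}"                     # canned data, not a real lookup
        return {**base, "result": {"content": [{"type": "text", "text": text}],
                                   "isError": False}}
    return {**base, "error": {"code": -32601, "message": "Method not found"}}

list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
call_request = {
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Paris"}},
}

# Round-trip through JSON to mimic what would travel over the wire.
listed = handle(json.loads(json.dumps(list_request)))
called = handle(json.loads(json.dumps(call_request)))
```

Because the client only ever sees these messages, the same server can sit behind any framework that speaks the protocol.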

The adoption curve was unusually fast for a standard. In March 2025, OpenAI announced MCP support across its products, starting with the Agents SDK. By Q3 2025, Microsoft shipped MCP servers for GitHub, Azure, and Microsoft 365. By Q1 2026, Google added MCP to Gemini and Vertex AI. In December 2025, Anthropic donated MCP to the Linux Foundation's new Agentic AI Foundation, co-founded with Block and OpenAI and backed by AWS, Google, and Microsoft, which removed the last political reason for non-Anthropic players to resist it. The public registry now lists thousands of servers. SDK downloads exceed 97 million per month. It is, at this point, the integration standard.

You don't write custom integrations for Slack, GitHub, Postgres, or Google Drive anymore; you point your agent at the appropriate MCP server. Frameworks compete on agent control flow, observability, and developer experience, while the integration layer becomes commodity infrastructure. This is what happened with HTTP for the web and JDBC for databases.

A2A, Agent-to-Agent, is the complement. Where MCP lets an agent talk to tools and data, A2A lets agents talk to other agents across framework boundaries. A LangGraph agent can invoke a CrewAI agent through A2A's standardized task interface. Adoption is still earlier than MCP but trending similarly, and Google's ADK ships with native support.

A second important development at this layer is what Anthropic has been calling "code execution with MCP" and what smolagents has done from the start. The insight is that as the number of available tools grows into the hundreds or thousands, loading every tool definition into the context window becomes prohibitively expensive in tokens. Instead, you expose tools as code on a virtual filesystem, give the agent a search-tools function and a code execution sandbox, and let it pull in only the tools it needs. The agent generates and runs a small script that calls the tools, instead of producing JSON for each tool call. This compresses both the input (tool definitions) and the intermediate state (tool outputs flow through code variables, not through the model's context), and it's how production agents at the high end will increasingly work.
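The two halves of that idea, searching for tools on demand and calling them from generated code, can be sketched as below. The catalog, the `github_list_issues` stub, and the "generated" script are all made up, and a bare `exec` is not a sandbox; a real system would run the script in an isolated environment.

```python
# Hypothetical catalog: in practice this could hold hundreds of tool descriptions.
TOOL_CATALOG = {
    "slack_post_message": "Post a message to a Slack channel",
    "github_create_issue": "Create an issue in a GitHub repository",
    "github_list_issues": "List open issues in a GitHub repository",
    "postgres_query": "Run a read-only SQL query",
}

def search_tools(query: str, limit: int = 5) -> list[str]:
    """Return only the tool names whose name or description matches every word."""
    words = query.lower().split()
    hits = [name for name, description in TOOL_CATALOG.items()
            if all(word in f"{name} {description}".lower() for word in words)]
    return hits[:limit]

def github_list_issues(repo: str) -> list[dict]:
    # stub standing in for a real tool call
    return [
        {"id": 1, "title": "crash on start", "labels": ["bug"]},
        {"id": 2, "title": "update docs", "labels": []},
        {"id": 3, "title": "wrong total", "labels": ["bug"]},
    ]

# A script the model might write: the issue list stays in a variable and
# only the final count would go back into the model's context.
generated_script = """
issues = github_list_issues("acme/app")
result = len([issue for issue in issues if "bug" in issue["labels"]])
"""

matching_tools = search_tools("github issue")
namespace = {"github_list_issues": github_list_issues}
exec(generated_script, namespace)        # NOT a sandbox; illustration only
bug_count = namespace["result"]
```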


where things actually are right now

Four things define the field right now.

The harness philosophy is winning. LangChain itself is now shipping deepagents, an explicitly harness-style API on top of langgraph, while keeping the graph layer underneath for cases where you do need it. As models get better, more orchestration moves into the model and out of the framework. The four most-cited production frameworks are now LangGraph, Claude Agent SDK, OpenAI Agents SDK, and Strands, with CrewAI as the role-based alternative and Agno as the pure-Python performance play.

MCP won. If you are designing a new agent system today, assume MCP is your integration layer, not an option you add later.

Most multi-agent hype was wrong. Most problems people thought needed multi-agent systems turned out to work better with a single well-equipped agent, good tools, and clear instructions. Multi-agent makes sense for independent specializations (a planner and an executor, or a generator and a critic) and for parallel exploration. LangChain's own published guidance: 80% of real applications work better as a single agent.

[Figure: Single agent vs multi-agent, the 80/20 rule. Single agent (80% of real problems): one agent with Search, Code, Files, API, Browser, and DB tools; ✓ simpler debugging, ✓ lower latency, ✓ cheaper. Multi-agent (when you actually need it): Planner, Executor, Critic; parallel exploration, generator + critic pairs, different system prompts.]

"Agent" is fragmenting into useful subcategories. Coding agents (Claude Code, Cursor, OpenHands, Aider), browser agents (Claude in Chrome, browser-use), workflow agents (n8n, Make), research agents (deepagents, Claude's research mode), customer-facing agents (support bots, phone agents). The general-purpose framework matters less when the agent is specialized. And across all categories, observability (LangSmith, Langfuse, Arize) and governance (guardrails, audit logs, human-in-the-loop) have become their own markets.


how to actually choose

You usually don't pick one. Real systems combine multiple frameworks: a LangGraph control flow calling MCP tools, embedded inside an n8n workflow for the integration plumbing, with observability through LangSmith or OpenTelemetry. The frameworks are not competing for the same slot in your stack; they're competing for different slots.

If forced to pick just one: pick LangGraph if you want maximum control and your problem has a real, knowable control flow structure; pick Agno if you want pure-Python ergonomics, fast iteration, and stateless scaling; pick Claude Agent SDK or deepagents if your problem is open-ended and you want to bet on the model; pick CrewAI if you have a team-of-specialists mental model and want to ship a prototype this week; pick n8n if your problem is mostly deterministic plumbing with AI in a few spots and a visual builder helps you reason about it; pick OpenAI Agents SDK if you're already deep in OpenAI's stack and want the path of least resistance. Skip AutoGen unless you're doing research or specifically need multi-agent dialogue.

[Figure: Which agent framework should you pick? Decision flowchart. Know the control flow in advance? If yes: mostly deterministic plumbing plus some AI leads to n8n; otherwise, needing explicit state and checkpointing leads to LangGraph, and not needing it leads to Agno. If no: an open-ended task (research, coding, debugging) splits by primary model provider (Anthropic: Claude SDK; OpenAI: OpenAI SDK; multiple: Strands); otherwise a team-of-specialists fit leads to CrewAI, a need for multi-agent dialogue leads to AutoGen, and anything else leads to Agno or Claude SDK.]
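One reading of that decision flowchart, written as a small function. The parameter names are invented for this sketch, and real choices rarely reduce to a handful of booleans, so treat it as a summary of the rules of thumb rather than a selection tool.

```python
def pick_framework(
    *,
    known_control_flow: bool,
    mostly_deterministic: bool = False,
    needs_checkpointing: bool = False,
    open_ended: bool = False,
    provider: str = "multi",           # "anthropic", "openai", or "multi"
    team_of_specialists: bool = False,
    needs_dialogue: bool = False,
) -> str:
    if known_control_flow:
        if mostly_deterministic:
            return "n8n"
        return "LangGraph" if needs_checkpointing else "Agno"
    if open_ended:
        by_provider = {"anthropic": "Claude Agent SDK", "openai": "OpenAI Agents SDK"}
        return by_provider.get(provider, "Strands")
    if team_of_specialists:
        return "CrewAI"
    if needs_dialogue:
        return "AutoGen"
    return "Agno or Claude Agent SDK"

choice = pick_framework(known_control_flow=True, needs_checkpointing=True)
```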