Building Agentic Growth Systems with Claude Code
There is a quiet ceiling in workflow automation that almost nobody talks about. You can build a beautiful n8n graph that pulls a lead, enriches it, scores it, routes it, alerts a rep. It runs. It works. And then a real-world edge case lands and your workflow can’t reason about it, so you add another If/Then node, and another, and another, until the thing becomes unmaintainable. That’s the ceiling.
Agentic growth systems are what sit above that ceiling. They are built on a different primitive: a model that reasons, picks tools, calls APIs, reads memory, and decides what to do next. Over the last 90 days I have moved most of the operational stack at Momentum Nexus, and at three of our partner companies, off pure workflow tools and onto Claude Code as the agent runtime. Daily blog production. Funding signal alerts. Outbound enrichment. Pre-sales research. Weekly campaign reports. All of it now runs on the same stack: a few well-written skills, scheduled via cron, with subagents for parallel work and MCP servers for external systems.
This post is the deep dive I wish someone had handed me when I started. It is not a “what is an agent” essay. It is the architecture, the actual primitives, three reference patterns we run in production, the cost math, and the four-week build path. If you operate a B2B SaaS growth team in the $50K to $150K MRR range and you have already pushed n8n or Zapier as far as they go, this is the layer above.
Workflow Automation Hits a Ceiling
I want to make the limits concrete, because they are rarely articulated cleanly.
A workflow tool, at its core, is a directed graph of steps. Trigger fires, data flows through nodes, each node transforms or routes, output lands somewhere. This pattern handles three kinds of work beautifully: synchronous transformations, simple deterministic routing, and stateless integrations.
The pattern breaks on three other kinds of work, which happen to be the highest-leverage growth tasks.
Tasks that require reasoning. “Score this lead 0 to 100 against our ICP” is not a deterministic transform. It is a judgment call that depends on context the lead might not surface explicitly. You can fake it with rules, but rules don’t survive contact with messy enterprise data.
Tasks with branching ambiguity. “Read this customer email and decide whether it’s a churn risk, a feature request, an objection we can answer, or an upsell signal.” Workflow tools handle this by routing on keywords, which works for the obvious 60% and fails for the 40% that actually matters.
Tasks that need cross-system memory. “Did we already follow up on this lead three weeks ago, and what did they say last time?” A workflow has no memory. You can wire up a database, but now you are building an agent runtime in pieces.
I have seen growth teams paper over these limits by stacking 200+ node workflows with embedded JavaScript, custom databases, and cron-driven sub-flows. At that point you have built an agent system. Badly, with no observability, in a tool optimized for something else.
Workflow automation is not wrong. It is the floor. Agentic systems are what you build when the floor is no longer enough.
What “Agentic” Actually Means (And What It Doesn’t)
The term gets thrown around loosely, so let me draw the line precisely.
A system is agentic when it has all four of these properties:

- A reasoning loop. A model receives an input, decides what to do, takes an action, observes the result, and decides the next step. Multiple iterations per task, not a single inference call.
- Tool use. The model calls external functions: read files, query APIs, execute shell commands, send messages. The actions are real, not just text outputs.
- Memory. Information persists across the loop, across the session, and ideally across runs. The agent remembers what it learned about the user, the project, or the data.
- Autonomy within constraints. The system decides which steps to take, how many to take, and when to stop. You set the goal and the guardrails. You don't script the path.
A Zapier “AI step” that calls GPT-4 to summarize a Slack message is not an agent. It is a workflow with an inference node. There is no loop, no real tool use, no memory, no autonomy. It is an LLM as a feature.
| Dimension | Workflow + LLM step | Agentic system |
|---|---|---|
| Control flow | You hardcode the path | Model decides the path |
| Inference calls | One per workflow run | Many per task, in a loop |
| Tool use | Fixed set wired by you | Selected by the model from a registry |
| Memory | Per-step variables | Persistent across sessions |
| Failure mode | Brittle on edge cases | Recovers via re-reasoning |
| Cost shape | Predictable, low | Variable, higher per task, but compresses headcount |
The reason this distinction matters operationally is a trade-off: agentic systems accept variable, higher per-run cost in exchange for handling work that workflows cannot handle at all. If you have read our take on what AI agents actually do for B2B sales, this is the architectural counterpart.
Why Claude Code Is the Runtime, Not Just an Editor
Most people first encounter Claude Code as a coding assistant. That is the surface area. Underneath, it is an agent runtime that happens to be excellent at software, and the two abstractions matter independently.
When I run a Claude Code session, the runtime gives me four things that no general-purpose LLM API gives me out of the box.
A primitive set designed for agents, not chat. Skills, subagents, tools, hooks, and persistent memory are first-class concepts. You don’t bolt them on; you compose with them.
Headless execution. claude -p "<prompt>" runs an entire reasoning loop from a shell script. With --allowedTools, --model sonnet, and --max-budget-usd 5 you have a sandboxed agent runner you can drop into a cron job (see the sketch just below). This is the single most underrated feature for ops teams.
A file system as the world model. Skills live in folders, memory lives in files, configuration lives in settings.json. The state of your agent system is git-able. You can review it, diff it, roll it back. That sounds basic, but it eliminates the entire class of “what changed in production?” problems that plague no-code platforms.
Session continuity. --continue and --resume let an agent pick up where it left off. Combined with hooks, this means you can build long-running, multi-step processes that span days without losing state.
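To make the headless shape concrete, here is a minimal sketch of a cron-ready wrapper, assuming the flags described above. The script name, paths, and skill are placeholders; swap in your own.

```bash
#!/usr/bin/env bash
# weekly-report.sh — hypothetical wrapper for a headless Claude Code run.
# Crontab entry (Mondays 06:00 UTC):
#   0 6 * * 1 /home/ops/weekly-report.sh >> /home/ops/cron.log 2>&1
set -euo pipefail

# Project root: skills, CLAUDE.md, and .mcp.json live here.
cd /home/ops/growth-agents

claude -p "Run /weekly-pipeline-summary" \
  --model sonnet \
  --max-budget-usd 5 \
  --allowedTools "Bash,Read,Write,Edit,Grep,WebSearch,WebFetch,Agent"
```

The whole agent invocation is one process with an exit code, which is exactly what cron and your alerting want.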
Compared to building this from scratch with the Anthropic SDK, Claude Code saves you somewhere between 200 and 800 hours of plumbing depending on the breadth of your use case. Compared to Zapier or n8n, it gives you reasoning and memory you cannot get at all. Compared to AutoGPT-style frameworks, it has actual production discipline: budget caps, permission modes, structured output, MCP integration, observability via hooks.
It is, on balance, the cleanest agent runtime I have used. The entire MN ops stack runs on it.
The 5 Primitives of an Agentic Stack
Every agentic system we build at Momentum Nexus composes from the same five primitives. This is the mental model worth carrying.
```
┌──────────────────────────────────────────────────────────┐
│                      AGENTIC STACK                       │
├──────────────────────────────────────────────────────────┤
│                                                          │
│   SKILLS        ← reusable instructions + scripts        │
│      ↓                                                   │
│   SUBAGENTS     ← isolated context, parallel work        │
│      ↓                                                   │
│   TOOLS / MCP   ← external systems and APIs              │
│      ↓                                                   │
│   HOOKS         ← deterministic lifecycle events         │
│      ↓                                                   │
│   MEMORY        ← persistent state across runs           │
│                                                          │
└──────────────────────────────────────────────────────────┘
```
Skills. A skill is a folder containing a SKILL.md (instructions) and optional helper scripts. It lives at ~/.claude/skills/<name>/SKILL.md for personal skills or .claude/skills/<name>/SKILL.md for project-scoped ones. The frontmatter declares when the skill should activate. You invoke it as /skill-name or the model triggers it automatically when the description matches the task. Think of skills as the “playbook layer”: they encode how to do a thing, including which subagents to call, which scripts to run, and what output to produce.
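As a sketch, a project-scoped skill might look like the following. The frontmatter carries the name and the activation description mentioned above; the skill name and steps are illustrative, not a real skill from our stack.

```markdown
<!-- .claude/skills/weekly-pipeline-summary/SKILL.md (illustrative) -->
---
name: weekly-pipeline-summary
description: Summarize the sales pipeline for the past week. Use when asked
  for a weekly pipeline report or when /weekly-pipeline-summary is invoked.
---

# Weekly pipeline summary

1. Pull this week's deal changes (scripts/fetch_deals.sh).
2. Compare against last week's snapshot in memory.
3. Write a one-page summary: wins, slips, risks, next actions.
4. Save it to reports/ and post the headline numbers to Slack.
```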
Subagents. A subagent is an isolated AI assistant with its own context window and tool permissions. You define them in .claude/agents/<name>.md. You can invoke many in parallel from the main agent, each working on an independent piece of the task. Subagents are not webhooks or async workers; they are sibling reasoning loops that return summarized results to the parent. This is what makes parallelism cheap.
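A minimal subagent definition, again as a sketch; the tool list and prompt are illustrative assumptions modeled on the enrichment subagent described in Pattern 3 below.

```markdown
<!-- .claude/agents/enrichment.md (illustrative) -->
---
name: enrichment
description: Research a company and return a structured profile. Use for
  inbound lead enrichment.
tools: WebSearch, WebFetch, Read
---

You are an enrichment researcher. Given a company name and domain, identify
industry, headcount, tech stack, and recent funding. Return an 8-line
profile. Never return raw page dumps; summarize.
```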
Tools and MCP servers. Tools are functions the agent can call: read a file, run a shell command, send an HTTP request. The Model Context Protocol (MCP) is an open spec for tool servers. A .mcp.json file in your project root wires up servers for Slack, Notion, HubSpot, Google Sheets, custom internal APIs. Once wired, the agent sees them as native tools. No webhook plumbing.
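A sketch of the wiring. The server packages named here are placeholders (substitute the MCP servers you actually use), and the ${VAR} references assume the tokens are present in the environment.

```json
{
  "mcpServers": {
    "slack": {
      "command": "npx",
      "args": ["-y", "@your-org/slack-mcp-server"],
      "env": { "SLACK_BOT_TOKEN": "${SLACK_BOT_TOKEN}" }
    },
    "hubspot": {
      "command": "npx",
      "args": ["-y", "@your-org/hubspot-mcp-server"],
      "env": { "HUBSPOT_ACCESS_TOKEN": "${HUBSPOT_ACCESS_TOKEN}" }
    }
  }
}
```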
Hooks. Hooks are deterministic shell commands that run at lifecycle events: PreToolUse, PostToolUse, UserPromptSubmit, Stop. You configure them in settings.json. They are not the agent reasoning; they are the rails. We use hooks to auto-commit edits, to send Slack alerts on session end, and to enforce permission checks on destructive commands.
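The shape in settings.json looks roughly like this; the notification and check scripts are placeholders, and the exact matcher syntax may differ slightly across Claude Code versions.

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "python3 ~/.claude/hooks/notify_session_end.py" }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "python3 ~/.claude/hooks/check_destructive.py" }
        ]
      }
    ]
  }
}
```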
Memory. Two layers exist. The first is CLAUDE.md, a file you write that the agent loads as context every session, ideal for static rules and conventions. The second is auto-memory: ~/.claude/projects/<id>/memory/MEMORY.md plus topic files, which the agent writes itself based on what it learns over time. Memory is what turns a one-shot agent into something that compounds across sessions.
These five primitives compose. A skill orchestrates subagents. A subagent calls MCP tools. A hook logs the session. Memory persists what was learned. Cron triggers the whole thing on a schedule. That is the entire architecture.
Pattern 1: The Self-Operating Daily Content Agent
The first system we built on this stack writes the post you are reading.
Every weekday at 09:07 Istanbul time, a system cron job fires daily-blog.sh. The script invokes Claude Code in headless mode with a budget cap and a single command: run the /mn-blog-writer skill. The skill reads a YAML topic queue and picks the next pending topic. It runs an overlap check against existing posts via the GitHub API, identifies SEO keywords, and performs research via WebSearch and a curated reference library. It writes a 3,000 to 4,000 word draft, runs a self-review quality gate, and generates a newsletter snippet. Finally, it pushes the MDX file to the website repo via gh api, marks the topic as done in the queue, and commits the working copy to a backup repo.
Total human time per post: 0 minutes for routine runs. About 10 minutes when I want to override the topic or angle.
Here is the actual stack, primitive by primitive.
| Primitive | What it does in this system |
|---|---|
| Cron | Triggers daily-blog.sh weekdays 06:07 UTC |
| Headless mode | claude -p "Run /mn-blog-writer" --model sonnet --max-budget-usd 5 --allowedTools "Bash,Read,Write,Edit,Grep,WebSearch,WebFetch,Agent" |
| Skill | /mn-blog-writer orchestrates the full pipeline |
| Subagents | Research subagent runs WebSearch in parallel with a tone-calibration subagent |
| MCP / external | gh api for GitHub push, Brevo HTTP API for email |
| Hooks | notify_success.py and notify_error.py run at exit, send Brevo emails to me |
| Memory | Reference library YAML files per content cluster, plus auto-memory of topics already covered |
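The topic queue the skill reads is nothing exotic, just a YAML file it picks from and updates. A sketch of the shape, with illustrative fields:

```yaml
# topics.yaml (illustrative) — the skill takes the first pending entry
- slug: agentic-growth-systems-claude-code
  title: Building Agentic Growth Systems with Claude Code
  cluster: ai-operations
  status: done          # pending | done
- slug: crm-data-hygiene-playbook
  title: How to Fix CRM Data in Two Weeks
  cluster: revops
  status: pending
```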
Cost per run: between $0.40 and $1.20 in Sonnet tokens. Hard cap at $5 via --max-budget-usd. Failure rate over 60 days: 4%, almost entirely transient WebSearch timeouts that resolve on the next day's run.
What this replaces: a content marketer at $4,000 to $6,000 per month, plus an SEO contractor, plus a newsletter writer. Not because the agent writes better than a human, but because it produces 80th percentile output on a daily cadence that no human team will sustain. The team time gets reallocated to higher-leverage work like distribution and partnerships.
This is the canonical pattern. Cron, headless skill, budget cap, GitHub as the publishing target. We have replicated it for funding alerts, weekly reports, pre-sales briefs, and outbound list generation. Same shape every time.
Pattern 2: The Signal-Driven Outreach Agent
The second pattern triggers on external signal rather than schedule.
We run a daily funding signal agent for a portfolio company’s sales team. At 11:00 Istanbul time, a cron job fetches RSS feeds from two Turkish funding news sources, deduplicates against a SQLite seen-URLs database, filters for events in the last 24 hours, and passes the survivors to Claude Code for enrichment.
Claude Code’s job, executed via claude -p, is to do the work no RSS reader can do: read each article, identify the company, extract the funding amount and round, look up the founders via web search, classify whether the company is a fit for the sales team’s ICP, and generate a per-company sales pitch tailored to that company’s situation. Output is structured JSON.
A second script then renders an HTML email digest and ships it via Brevo to the sales team, but only if there are matches worth sending. No matches means no email, which respects inbox attention.
| Stage | Implementation |
|---|---|
| Trigger | System cron, 08:00 UTC daily |
| Pre-filter | Python script: RSS fetch, dedup, recency filter |
| Agent invocation | claude -p per company with structured output schema |
| Reasoning | Web search for founder context, ICP classification, pitch generation |
| Post-processing | Python script: HTML render, Brevo send |
| Memory | SQLite for seen URLs, prevents repeat sends |
The reason this cannot be a pure n8n workflow: the per-company enrichment requires reading prose, making judgment calls, and generating bespoke pitches. The reason it cannot be a pure agentic loop: the dedup and the recency filter are deterministic and don’t need reasoning, so paying for inference there would be wasteful. The right architecture is a sandwich: deterministic pre-filter, agentic middle, deterministic post-process.
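A compressed sketch of the sandwich, with the deterministic edges reduced to placeholder scripts; the claude -p call in the middle is the only inference spend, and it is skipped entirely when the pre-filter finds nothing.

```bash
#!/usr/bin/env bash
# funding-signal.sh — illustrative sandwich: deterministic edges, agentic middle.
set -euo pipefail
cd /home/ops/funding-signal

# 1. Deterministic pre-filter: fetch RSS, drop seen URLs, keep the last 24h.
#    prefilter.py (placeholder) writes nothing when there are no survivors.
python3 prefilter.py --db seen_urls.sqlite --since 24h > candidates.json

# No new candidates: no inference spend, no email.
[ -s candidates.json ] || exit 0

# 2. Agentic middle: one reasoning loop over the survivors, structured output.
claude -p "Enrich each company in candidates.json: funding amount and round,
founders, ICP fit, tailored pitch. Write structured JSON to enriched.json." \
  --model sonnet --max-budget-usd 5 \
  --allowedTools "Bash,Read,Write,WebSearch,WebFetch"

# 3. Deterministic post-process: render HTML, send via Brevo only on matches.
python3 send_digest.py --in enriched.json --min-matches 1
```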
That sandwich pattern is the second canonical shape. We use it for weekly campaign reports (deterministic pull from HeyReach API, agentic narrative summary, deterministic email send), pre-sales research (deterministic company URL lookup, agentic deep research, deterministic PDF render), and inbound lead enrichment.
If you have ever wrestled with the gap between AI workflows that promise to save time and ones that actually do, this sandwich is the answer. The agentic middle is where the leverage sits. The deterministic edges are what make it operable.
Pattern 3: The Multi-Agent Lead Operating System
The third pattern is where subagents earn their keep.
A standard inbound lead enrichment agent does its work serially: fetch company data, score against ICP, route to a rep, log to CRM. That is fine for a few hundred leads a month. At higher volume, the serial loop is too slow and the context window gets bloated with intermediate enrichment data the final routing decision does not need.
The multi-agent version splits the work across subagents. The orchestrator agent receives the lead webhook, then spawns three subagents in parallel.
Enrichment subagent. Pulls company data from Apollo or Apify, identifies industry, headcount, tech stack, recent funding. Returns a structured profile.
Intent subagent. Reads any free-text fields the lead submitted, classifies intent (pricing inquiry, support, partnership, RFP), extracts the specific question being asked.
Account history subagent. Checks CRM for prior interactions, deals, conversations, blocklist matches. Returns a one-paragraph summary.
The orchestrator waits for all three, synthesizes the result, decides routing, calls the HubSpot MCP server to create the contact and assign it, calls the Slack MCP server to alert the rep with a context-rich message that includes the intent, the company snapshot, and the prior history. Total time from lead submission to rep notification: 30 to 60 seconds, with the parallel subagents cutting roughly 70% off the serial baseline.
Why this matters operationally is subtle. Each subagent has its own context window, so the orchestrator never sees the messy enrichment data. It only sees the summarized result. This keeps reasoning quality high on the routing decision because the orchestrator’s context is not drowning in irrelevant detail.
| Subagent | Tools | Output to orchestrator |
|---|---|---|
| Enrichment | Apify MCP, web search | 8-line company profile |
| Intent classifier | None (pure reasoning) | Intent label + question summary |
| Account history | HubSpot MCP, internal DB | One-paragraph history |
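The orchestrator itself can be one headless invocation fired by your webhook handler. A sketch, with the prompt, paths, and budget illustrative; the mcp__ tool names assume MCP servers registered as hubspot and slack.

```bash
#!/usr/bin/env bash
# route-lead.sh <lead.json> — illustrative orchestrator entry point.
set -euo pipefail
lead_file="$1"
mkdir -p runs

claude -p "New inbound lead in ${lead_file}. Run the enrichment, intent,
and account-history subagents in parallel, synthesize their results,
decide routing, then create the HubSpot contact and alert the owning
rep in Slack. Draft only; never send anything customer-facing." \
  --model sonnet --max-budget-usd 2 \
  --allowedTools "Read,Agent,mcp__hubspot,mcp__slack" \
  --output-format json > "runs/$(date +%s).json"
```

The JSON session output lands in runs/, which is where the structured per-subagent token and timing data described in the next section comes from.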
The same multi-agent shape generalizes to any RevOps task with multiple parallelizable lookups: account research before a sales call, churn risk scoring across signals, opportunity reviews. The pattern is what is reusable, not the specific subagents.
This is the pattern that finally makes the case for treating growth as engineering, not marketing operations. Subagents are how you get from "we automated some tasks" to "we operate a system."
The Operating Math: Cost, Reliability, Observability
I am going to be specific about the numbers, because too many agentic-system posts hand-wave on this.
Cost. Sonnet runs us between $0.40 and $1.20 per agent invocation across the patterns above. With a $5 hard cap per run via --max-budget-usd, the worst-case daily cost across all our cron-driven agents is around $25. Realistic daily spend is $4 to $9. For a growth team replacing two part-time roles, the math closes inside the first month.
Reliability. Across roughly 1,400 production runs over the last 90 days, we see a 93% to 96% clean-success rate. Most failures are transient (network timeouts on web search, API rate limits) and either retry on the next cron tick or surface to me via the error notification hook. About 1% of failures require code intervention, almost always because a referenced API changed schema.
Observability. Every cron-triggered agent writes to a cron.log file. Hooks send Brevo emails on success and failure with the slug or error message. For longer-running multi-agent flows, we use the JSON output mode to get structured session data including total tokens used, tool calls made, and timing per subagent. That is enough observability for this scale; if we were running ten times the volume we would add OpenTelemetry export.
Idempotency. This is the one most teams underbuild. Every agent we ship has a “did I already do this?” check before taking action. Funding signal agent: SQLite seen-URLs table. Daily blog agent: queue status field plus existence check on GitHub. Lead enrichment agent: HubSpot contact lookup before creating. Without idempotency you ship duplicate emails, duplicate Notion tasks, duplicate everything, the first time something retries.
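The check itself is tiny. A sketch of the SQLite variant from the funding agent, assuming a seen table keyed by URL:

```bash
# Illustrative idempotency guard: skip any URL already processed.
db="seen_urls.sqlite"
url="https://example.com/funding-round"

sqlite3 "$db" "CREATE TABLE IF NOT EXISTS seen (url TEXT PRIMARY KEY, at TEXT);"

if [ -n "$(sqlite3 "$db" "SELECT 1 FROM seen WHERE url = '$url';")" ]; then
  echo "already processed, skipping" >&2
else
  # ... take the externally visible action here, then record it ...
  sqlite3 "$db" "INSERT INTO seen (url, at) VALUES ('$url', datetime('now'));"
fi
```

Record after the action succeeds, not before, or a crash mid-run will permanently skip the item.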
| Metric | Daily blog agent | Funding signal | Lead enrichment |
|---|---|---|---|
| Avg cost per run | $0.85 | $0.30 | $0.15 |
| Median latency | 4 min | 90 sec | 45 sec |
| Success rate | 96% | 94% | 98% |
| Idempotency mechanism | Queue status | SQLite seen URLs | CRM lookup |
| Failure recovery | Next-day retry | Next-day retry | Manual review queue |
The numbers above are not theoretical. They are what we actually see, every day.
A 4-Week Implementation Path
If you are starting from a workflow-automation stack and want to add an agentic layer, here is how to sequence the build.
Week 1: Data hygiene and one skill. Audit the data your agents will read. Stale CRM records, broken integrations, missing fields are the single biggest failure mode for downstream agents. We covered the playbook in detail in how to fix CRM data in two weeks, and that work belongs before any agentic build, not after. While that is running, install Claude Code and write your first skill. Pick something small, a weekly pipeline summary or a competitor research brief. Get one skill working end-to-end manually before automating it.
Week 2: Headless mode and one cron job. Wrap your skill in daily-skill.sh with claude -p, --model sonnet, --max-budget-usd, and the minimum tool allowlist. Schedule it via system cron. Add Brevo or Slack notifications via a Stop hook so you know when it runs. Run it for five business days. Watch for failure modes. Fix them.
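For the scheduling step, the crontab entry is one line; the path is a placeholder, and the log redirection is what feeds the cron.log used for observability above.

```bash
# crontab -e (illustrative): weekdays at 06:07 UTC, stdout and stderr logged
7 6 * * 1-5 /home/ops/growth-agents/daily-skill.sh >> /home/ops/growth-agents/cron.log 2>&1
```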
Week 3: MCP and external systems. Wire up the MCP servers you need: Slack, Notion, HubSpot, Google Sheets. Update your skill to call these as native tools instead of via custom HTTP requests. The code shrinks and the reasoning improves because the agent sees the tools as first-class primitives.
Week 4: Subagents and the second pattern. Take a task that would benefit from parallelism (lead enrichment, multi-source research) and decompose it into two or three subagents. Define them in .claude/agents/<name>.md. Have the main agent invoke them in parallel. Compare wall-clock time and reasoning quality against the serial version.
By the end of week 4 you have: one production cron-driven skill, one MCP-integrated agent, one multi-agent flow, and observability on all three. That is enough to replace 40% to 70% of a growth team’s manual operational work, depending on your starting point.
What you do not need in the first 30 days: a fancy UI, a custom dashboard, an agent marketplace, or a vector database. Those are useful at scale. They are pure distraction at the build phase.
Three Mistakes That Kill Agentic Projects
I have watched these three failure modes kill more agentic builds than any technical issue.
Agentic theatre. Building agents to look agentic, not to deliver outcomes. Symptom: a “lead qualification agent” that does what a five-line rule could do, plus an inference call. Cure: define the outcome before you define the architecture. If a rule works, ship the rule. Agents are for problems that need reasoning, memory, or autonomy, not for problems you have already solved.
No idempotency. Already covered above, but worth repeating because it is the one that bites you in week three of production. Every externally visible action (send email, create CRM record, post message) needs a "did I do this already?" check before it fires. The check should be cheap and durable: SQLite, a CRM lookup, a queue status field. Without it, your first retry storm sends the same email to your top 30 prospects four times. I have done this. It hurts.
No human in the loop on irreversible decisions. Some growth tasks are reversible, like sending a rep a Slack alert or drafting an email to review. Some are not, like actually sending the email or charging the card. Agentic systems should be configured to draft, queue, or alert on irreversible actions, not execute them autonomously. The 1% of cases where an agent confidently does the wrong thing on an irreversible action will burn more trust than the 99% of correct cases earn.
The pattern I use: reversible actions are auto-execute, irreversible actions are auto-draft with human approval. Sales emails: draft and Slack the rep for one-click approve. Deal creation: auto-execute, easy to undo. Customer-facing campaign emails: never autonomous.
The Bigger Shift
Workflow automation made it possible to stop doing repetitive tasks. Agentic systems make it possible to stop running operations. That distinction is the entire point.
When the daily blog writes itself, when the funding signal arrives in the sales team’s inbox before they have poured coffee, when inbound leads are enriched and routed before the form-submit confirmation finishes loading, what is happening underneath is a shift from “we automated steps” to “we built a system that operates itself, within bounds we set.” That is the layer growth teams in the $50K to $150K MRR range should be building toward over the next 12 months.
The teams that get there first will run with two-thirds the headcount and three times the operational throughput. The ones that don’t will spend the next five years building larger and larger workflow graphs in tools that hit a ceiling years ago.
If you want help mapping which parts of your growth ops are workflow problems and which are agentic problems, book a free growth audit and we will walk your stack with you. If you would rather start building, install Claude Code, write one skill, and put it on cron. The next layer of growth ops is closer than most people think.