InsiderAITrends

Vibe Coding vs Agentic Engineering: Why SMB Owners Should Care How Their Custom AI Gets Built

SMB owners are replacing SaaS bills with custom AI. The build method decides whether the replacement holds for years or breaks in 6 months. Two approaches, two outcomes.

By Jorge Del Carpio
vibe-coding · agentic-engineering · ai-development · custom-ai · saas-replacement · smb

TL;DR

Vibe coding (Karpathy, Feb 2025) means letting an LLM write code you never read. Agentic engineering means agents plan, code, test, and iterate against specs you wrote, with senior review on every architectural call. The first ships demos. The second ships the system that actually replaces your SaaS subscription. If you're an SMB owner about to commission a custom AI build, the methodology your vendor uses matters more than the framework they pick.

What Karpathy actually said, and what he didn’t

On February 2, 2025, Andrej Karpathy posted what he later called “a shower-of-thoughts throwaway tweet.” He described a way of working with Cursor Composer and Sonnet where he barely touched the keyboard, talked to the agent through SuperWhisper, accepted every suggestion without reading the diff, and pasted errors back without comment when something broke. He called it “vibe coding.”

The post got 4.5 million views. It became a brand. And the brand drifted.

Karpathy was talking about personal projects. He was not telling product teams to ship customer-facing systems this way. By March 2025, the term was being applied to internal tools, MVPs, even production code at startups. By Q3 2025, Amazon had to issue a 90-day reset on its deployment controls after an AI assistant produced what was internally described as “high blast radius changes” with inadequate safeguards.

The disconnect between Karpathy’s tweet and how the term got used is the whole story.

Why this matters if you’re replacing SaaS with custom AI

The most interesting trend in 2026 isn’t AI replacing software engineers. It’s SMB owners cancelling SaaS subscriptions and commissioning custom AI builds in their place.

A 50-person agency paying $1,200 a month for a project management tool starts wondering whether a $15,000 custom build, with AI baked in for status updates and reporting, would pay back faster. A real estate brokerage paying $4,800 a year for a CRM that doesn’t quite fit their workflow asks the same question. Across our pipeline at Kreante, this conversation has become the most common opening pitch we hear from prospects.
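The agency's back-of-envelope math can be sketched from the figures above, plus one assumption the article doesn't spell out: ongoing maintenance cost for the custom build (the $150/month figure here is purely illustrative).

```python
# Payback sketch for the agency example: $1,200/month SaaS vs. a
# $15,000 custom build. The maintenance figure is a hypothetical
# placeholder, not a number from the article.
saas_monthly = 1_200        # current SaaS bill, USD/month
build_cost = 15_000         # one-time custom build, USD
maintenance_monthly = 150   # assumed upkeep for the custom build (illustrative)

# Months until cumulative SaaS spend exceeds build cost plus upkeep.
payback_months = build_cost / (saas_monthly - maintenance_monthly)
print(round(payback_months, 1))  # ~14.3 months, if the build holds
```

On paper the build pays for itself in just over a year. The rest of the article is about why "if the build holds" is the load-bearing clause.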

The math works on paper. It doesn’t always work in practice. And the variable that decides whether it works isn’t the framework, the AI model, or even the budget. It’s how the code gets made.

A custom AI build that replaces a SaaS subscription has to hold for years to pay back. If the build breaks at 100 concurrent users because no one designed for it, the rebuild costs more than the SaaS would have charged for the original term. If the build sits on hallucinated APIs that the underlying platform doesn’t actually support, you find out the hard way when the platform updates and your tool stops responding.

The vibe-versus-agentic distinction is exactly where this risk lives.

Vibe coding: where it works, where it kills you

Vibe coding works for exactly what Karpathy described. Personal scripts. Weekend hackathons. Disposable demos. Learning exercises where the point is to accelerate exposure, not produce a maintainable artifact. Creative brainstorming, where over-generation is a feature.

It doesn’t work for systems anyone will depend on, and the failure modes are predictable.

Maintainability collapse. When the developer who shipped the system never read the code, no one understands the original context. Six months later, when a feature needs to change, the maintainer has to reverse-engineer intent from generated output. Most of the time they rewrite from scratch.

Cost blindness. AI-generated code doesn’t know what your billing structure looks like. It will write a loop that hits an API a thousand times instead of a single batch call, because the loop is grammatically correct. You won’t notice until the invoice arrives. Voitanos documented this pattern as one of the most common failure modes in early-stage AI builds.
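The loop-versus-batch pattern is easy to see in miniature. In this sketch a counter stands in for a billed API; the function names and the 1,000-record workload are hypothetical, but the shape is the one Voitanos describes: the naive loop is grammatically correct and a thousand times more expensive.

```python
# A counter stands in for a metered API: every call is a billed request.
calls = {"n": 0}

def fetch_one(record_id):
    calls["n"] += 1                      # one billed request per record
    return {"id": record_id}

def fetch_batch(record_ids):
    calls["n"] += 1                      # one billed request for the batch
    return [{"id": r} for r in record_ids]

ids = list(range(1000))

# What generated code often produces: 1,000 billed calls.
_ = [fetch_one(i) for i in ids]
loop_calls = calls["n"]

calls["n"] = 0
# Same data, one billed call.
_ = fetch_batch(ids)
batch_calls = calls["n"]

print(loop_calls, batch_calls)  # 1000 1
```

Nothing in the loop version is wrong as code. It is only wrong as a bill, which is why a vendor that never reads the diff never catches it.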

Hallucinated APIs. Models will suggest technically plausible solutions that the underlying platform doesn’t actually support. DOM manipulation on SharePoint surfaces. Fields on Salesforce objects that don’t exist. Routes on Webflow that aren’t part of the API. The code looks reasonable. It runs in dev. It crashes the first time it touches production data.

Context loss. Every new session starts from zero. The agent doesn’t remember why a decision got made three weeks ago. It will undo subtle invariants because they aren’t visible in the file it was asked to edit.

A commenter on the Turing College piece put it best: “demos great, then reality arrives.”

Agentic engineering: AI as your sous chef

Agentic engineering inverts the relationship. The human writes a spec. The human writes the acceptance tests. The agent then plans, writes, tests, and iterates until the tests pass. The senior engineer reviews architecture, security boundaries, and integration points. The agent does the typing.

Voitanos uses a kitchen metaphor that lands: AI is the sous chef. Skilled, fast, capable. Working under your direction. Not the head chef.

This is what changes:

Specs as source of truth. When the spec is the artifact, not the code, the build process is auditable. New team members read the spec. AI agents code against it. Tests verify it. The codebase becomes legible in a way vibe-coded repos never are.
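What "spec as artifact" can look like in practice: the spec is data, the acceptance test reads it, and the agent's code is checked against it rather than eyeballed. Everything here (the `SPEC` fields, the `status_report` function) is a hypothetical sketch, not a prescribed format.

```python
# A minimal sketch of spec-driven acceptance testing. The spec lives in
# the repo; the test derives its assertions from it; the agent's code
# either satisfies it or fails the pipeline.
SPEC = {
    "required_fields": ["id", "status", "updated_at"],
}

def status_report(project):
    # The function the agent is asked to produce against the spec.
    return {"id": project["id"], "status": "on-track", "updated_at": "2026-01-05"}

def test_report_matches_spec():
    report = status_report({"id": 42})
    for field in SPEC["required_fields"]:
        assert field in report, f"spec requires field {field!r}"

test_report_matches_spec()
print("spec check passed")
```

The point is not this particular schema; it's that a new team member or a fresh agent session can read `SPEC` and know what the build must do, without reverse-engineering intent from generated output.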

Senior review on architecture, not on character count. Senior engineers stop reviewing every line and start reviewing every architectural call. Does this scale past 100k users? Does this leak data across tenants? Does this depend on an API the vendor flagged as deprecated? The cognitive load shifts from production to judgment, which is where the senior expertise actually lives.

Cost discipline. When the spec includes performance requirements (latency budget, API call ceiling, token budget), the agent codes within them. Loops that should be batches get caught at the test layer, not the invoice layer.

Production-grade by default. Compliance hooks, observability, governance, retention metrics, audit logs. These are spec items, not afterthoughts. The agent ships them as part of the build, because the tests demand them.

This is what production-grade AI development looks like in 2026. It isn’t slower than vibe coding. It’s about the same speed for the first version, and dramatically faster for everything that comes after, because the second feature isn’t a rewrite.

Pros and cons, mapped to actual SMB decisions

| Dimension | Vibe Coding | Agentic Engineering |
| --- | --- | --- |
| Time to demo | Hours | Days to a week |
| Time to production | Often never | 4 to 8 weeks |
| Code quality | Generated, unreviewed | Production-grade, reviewed |
| Maintainability | Collapses in months | Holds for years |
| Cost predictability | Hidden until invoice | Designed in via spec |
| Hallucinated API risk | High, undetected | Caught at test layer |
| Senior engineer role | Skipped | Architecture and review |
| Best fit | Personal scripts, hackathons | Anything you'll depend on |
| SMB use case | Internal experiments only | SaaS replacement, customer-facing tools, ops automation |

The line between the two columns is the line between a tool that costs you less than the SaaS it replaced, and a tool that costs you more, twice.

How to tell which approach a vendor uses

Three questions to ask before you sign. The answers separate signal from sales pitch.

1. Show me the spec and the acceptance tests for a recent build.

A vendor doing agentic engineering has these as artifacts. They’re not pretty PDFs; they’re working files in a repo. If the answer is “we work iteratively, we don’t write specs,” you’re hiring vibe coders. If the answer is “here’s the spec from our last fintech build with PII redacted, and the test file that drives the build pipeline,” you’re hiring engineers.

2. Walk me through your code review process for AI-generated code.

Listen for the rituals. “All AI output goes through the same PR process as human-written code, with senior approval on architectural changes” is the right answer. “We move fast, we trust the agents” is a red flag. Agentic teams have a review discipline. Vibe coders don’t review.

3. What happens when an agent hallucinates an unsupported API?

The right answer mentions guardrails: type checks against documented APIs, integration tests that fail when an endpoint doesn’t exist, sandboxed testing before any code touches a real service. The wrong answer is hand-waving. If your vendor doesn’t have a process for this, your custom build will eat the failure mode in production.
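One of those guardrails can be sketched in a few lines: validate every route the agent emits against the platform's documented API surface before anything ships. The endpoint list and the invented route below are hypothetical; real builds would derive the allowlist from the vendor's published API reference or an OpenAPI document.

```python
# Guardrail sketch: reject any route that isn't in the documented API
# surface. The allowlist here is made up for illustration.
DOCUMENTED_ENDPOINTS = {
    "/v1/projects",
    "/v1/projects/{id}",
    "/v1/reports",
}

def check_endpoint(path_template):
    if path_template not in DOCUMENTED_ENDPOINTS:
        raise ValueError(f"hallucinated API: {path_template!r} is not documented")
    return True

check_endpoint("/v1/projects")                 # passes
try:
    check_endpoint("/v1/projects/{id}/vibes")  # a route the agent invented
except ValueError as e:
    print(e)
```

The check is trivial; what matters is that it runs automatically, so a plausible-looking route fails in CI instead of failing the first time it touches production data.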

If a vendor passes all three questions cleanly, you’re talking to an agentic engineering team. If not, you’re paying for vibes.

What to do this week

Pull your last 12 months of SaaS bills. Highlight every recurring expense over $200 a month. For each, ask whether a custom AI build could replace the workflow at lower long-term cost. The candidates you find are the ones worth a conversation.
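The triage step above is a one-liner once the bills are in a list. The vendors and amounts here are invented for illustration; the $200/month threshold is the one from the article.

```python
# Flag SaaS bills over the $200/month line and annualize them.
bills = [
    {"vendor": "PM tool", "monthly": 1_200},
    {"vendor": "CRM",     "monthly": 400},
    {"vendor": "Forms",   "monthly": 49},
]

candidates = [b for b in bills if b["monthly"] > 200]
for b in candidates:
    print(f'{b["vendor"]}: ${b["monthly"] * 12:,}/year')
```

Anything that survives the filter is worth the three-question conversation below.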

Then pick one. Take it to a real agentic engineering team. Ask the three questions. If the answers are clean, you have a path to replacing that subscription with something you own. If the answers are vibes, walk.

A custom AI build is only cheaper than SaaS if it actually works. The methodology decides.

For SMB owners evaluating vendors for an AI-native build, the production-grade vendor evaluation looks at spec-and-test artifacts, real case studies that signal engineering depth versus pitch decks, and clarity on which parts of the build the vendor handles versus expects you to manage.

Frequently asked questions

What's the difference between vibe coding and agentic engineering?
Vibe coding is conversational AI development where the developer accepts generated code without reading it. Agentic engineering uses AI agents that plan, write, test, and iterate against a human-defined spec, with senior engineers reviewing architecture. Vibe coding is fast and disposable. Agentic engineering is fast and durable.
Who coined the term “vibe coding”?
Andrej Karpathy, in a tweet on February 2, 2025. He described it as letting LLMs write code you barely read, accepting all suggestions, and pasting errors back without comment. He called it “fun” for personal projects. The term went viral and got reapplied to professional contexts where the approach breaks down.
Is vibe coding bad?
Not for what Karpathy described it as: weekend projects, hackathons, throwaway scripts, learning exercises. It's bad as a methodology for production systems. The 2025 incident at Amazon, which led to a 90-day reset on deployment controls, is the kind of failure mode you get when vibe-coded systems hit real users.
How do I know if a vendor uses vibe coding or agentic engineering?
Ask three questions before signing. One: show me the spec and acceptance tests for a recent build. Two: walk me through your code review process for AI-generated code. Three: what happens when an agent hallucinates an unsupported API? Agentic engineering teams have answers. Vibe coders don't.
Should an SMB owner replacing SaaS care about this?
Yes. A custom AI build that replaces a $500 per month SaaS subscription needs to hold for years to pay back. A vibe-coded version that breaks at 100 users costs more than the SaaS would have, including the rebuild. The methodology decides whether the replacement is real or not.
