Are spec-driven frameworks like Agent OS, BMAD, Superpowers or SpecKit still worth using, or have Claude Code and Codex made them redundant?
[Original Reddit post](https://www.reddit.com/r/ClaudeCode/comments/1t2mym5/are_specdriven_frameworks_like_agent_os_bmad/)
I've been trying to figure out where the community has landed on this, because I genuinely can't tell.
A year ago, the answer seemed obvious: if you're building anything non-trivial with LLMs, you need structured scaffolding (PRDs, memory layers, agent roles, task breakdowns). Frameworks like BMAD-METHOD, Agent OS, Superpowers, and SpecKit (and their cousins) exist precisely because raw LLMs drift, forget context, and produce spaghetti if you don't constrain them with specs upfront.
But now I look at Claude Code and Codex, for example, since they are the ones I'm using, and they feel... different? Claude Code does its own task decomposition, maintains context across files, and can self-correct mid-session without you babysitting a spec document. Codex feels similar: it reasons about the codebase, not just the prompt.
So I'm genuinely asking:
Do you still scaffold everything with a spec framework before touching Claude Code / Codex?
Or do you drop straight into vanilla agentic mode and only reach for a framework when things break down?
Or is the real answer that spec frameworks matter more now — because you're giving these powerful agents more autonomy, so the upfront spec is the only guardrail you have?
I built a mid-complexity product (a new vertical on top of an existing platform) using Claude Code and Codex with Agent OS as a seat belt. That was a year ago. I have a few projects coming up and am trying to decide whether investing in proper spec scaffolding is a force multiplier or just overhead that the model now handles natively.
Would love to hear from people who've shipped something real with either approach — not theory, actual experience.
Worth mentioning: I'm originally a Product Leader who vibe codes now, not a software engineer.
Originally posted by u/3abwahab on r/ClaudeCode