Write the Spec Before the Code. AWS Kiro Takes Aim at Vibe Coding's Weak Spot

What Vibe Coding Was Missing Was “Five Minutes Before Writing”

Through last week, I wrote four articles, all on vibe coding security.

I dug into the Moltbook leak of 1.5 million API keys. I tracked the Cursor CEO’s “fragile foundation” comment. I cross-checked both against Forrester’s three principles. I also mapped out the turning point “from vibes to governance.”

While writing them, one thing kept nagging at me. The problem isn’t just security. Quality itself is unstable.

Open Cursor. Type “I want to build a feature like this.” Something working comes out. Fun.

But look at it again the next week. The design intent is unclear. Tests don’t pass. It can’t handle spec changes.

There’s a big gap between “it ran” and “it’s usable.”

One number captures this gap. In Stack Overflow’s 2024 survey, 63% of AI coding tool users said they spend “more time than expected” on debugging. You can write fast, but fixing still takes time. This is vibe coding’s Achilles’ heel.

AWS built an IDE called Kiro. It tackles this gap head-on.

The idea is simple. Write the spec before the code. That one change addresses the quality problem at a structural level.

As a former engineer who washed out, let me walk through Kiro’s spec-driven development.

What Is AWS Kiro? An IDE Where “Structure Comes First, Code Comes Later”

[Figure: Kiro workflow. Natural language → requirements.md → design.md → tasks.md → implementation → testing]

Kiro is an AI-powered code editor developed by AWS. It’s built on top of VS Code (Visual Studio Code).

The look resembles Cursor. But the design philosophy is completely different.

Cursor’s weapon is speed. The AI completes your code as you type.

Claude Code’s weapon is autonomy. It understands the entire repository and proposes changes.

Kiro’s weapon is structure. It auto-generates three documents before you write a single line of code.

The first is requirements.md (requirements). It defines what you’re building. It writes user stories and acceptance criteria.

The second is design.md (design). It decides how you’ll build it. It includes technical design, DB schema, and sequence diagrams.

The third is tasks.md (tasks). It’s the implementation playbook. You knock items off one by one in checklist form.

You tell it “I want to build an app like this” in natural language. Kiro generates the three files. You check the contents and give it the OK. Then you proceed to implementation.

This approach is called Spec-Driven Development.

The preview was released in July 2025. GA (general availability) began in November of the same year.

According to the official site, over 250,000 people used it during the preview alone.

Pricing starts with a free plan (50 credits/month). Pro is $20/month for 1,000 credits. Pro+ is $40/month for 2,000 credits.

The free tier is enough to try it out. The barrier to trying is low.

Here’s an important point. Kiro is not “a tool where AI writes code.” It’s more accurate to understand it as “a tool where AI writes the design.” Code comes out as the result of design. This difference in order directly impacts quality.

Kiro also has steering files and agent hooks. Steering files give the agent persistent project context, and hooks trigger agent actions on events such as a file save. Used together, they let Kiro refer back to requirements.md and prompt a course correction when code starts to drift from the spec during implementation. The spec keeps watching the entire implementation as a “living document.”

EARS Notation: Jet Engine Safety Standards Come Down to Vibe Coding

Kiro’s requirements definition uses EARS notation. The official name is Easy Approach to Requirements Syntax. Pronounced “ears.”

Where this notation came from is interesting: Rolls-Royce’s jet engine business.

Alistair Mavin and colleagues developed it in 2009 while analyzing airworthiness regulations for a jet engine control system. The goal was to write “under this condition, the engine behaves like this” without ambiguity.

The basic form looks like this.

# EARS notation basic form
# While (precondition) + When (trigger) + shall (system response)

WHEN a user submits a contact form with valid data
THE SYSTEM SHALL send confirmation email within 5 minutes
AND display success message

There are five patterns.

Ubiquitous (no keyword): Requirements that always apply. “Log all requests”
Event-driven (WHEN): Triggered by a specific event. “Create session on login”
State-driven (WHILE): Applies while in a specific state. “Reject writes during maintenance”
Unwanted behavior (IF/THEN): Handling errors. “Lock after 3 failed authentications”
Optional feature (WHERE): Applies only where a feature is present. “CSV export if premium”
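To make the five patterns concrete, here is a minimal Python sketch that guesses the pattern of a requirement from its leading keyword. It assumes the usual EARS keywords (When, While, If, Where, and no keyword for ubiquitous requirements). The function name classify_ears is hypothetical and has nothing to do with how Kiro itself parses requirements.

```python
import re

# Each non-ubiquitous EARS pattern is identified by the keyword the sentence starts with.
_PATTERNS = [
    ("unwanted_behavior", re.compile(r"^\s*if\b", re.IGNORECASE)),
    ("event_driven", re.compile(r"^\s*when\b", re.IGNORECASE)),
    ("state_driven", re.compile(r"^\s*while\b", re.IGNORECASE)),
    ("optional_feature", re.compile(r"^\s*where\b", re.IGNORECASE)),
]


def classify_ears(requirement: str) -> str:
    """Return the EARS pattern name for a single requirement sentence."""
    # Every EARS requirement states what the system "shall" do.
    if not re.search(r"\bshall\b", requirement, re.IGNORECASE):
        raise ValueError("not an EARS requirement: missing 'shall'")
    for name, pattern in _PATTERNS:
        if pattern.search(requirement):
            return name
    # No leading keyword means the requirement always applies.
    return "ubiquitous"


if __name__ == "__main__":
    print(classify_ears("While in maintenance mode, the system shall reject writes"))
```

A check like this is only a lint. It tells you which template a sentence follows, not whether the requirement is complete.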

According to the EARS official site, it’s adopted by Airbus, NASA, Bosch, and Intel. It has already been validated on the front lines of aerospace, automotive, and semiconductors.

Why does this work for vibe coding? Let me show with a concrete example.

Prompt without EARS notation: “Build a login screen.”

Requirements with EARS notation:

WHEN a user attempts to log in
THE SYSTEM SHALL validate email and password format
IF the credentials are incorrect 3 consecutive times
THE SYSTEM SHALL lock the account for 30 minutes
AND send a security notification to the registered email

Can you tell what’s undefined in the prompt “build a login screen”? Lockout conditions, whether notifications are sent, lock duration, recovery method. All of it is undefined. The AI won’t reliably build what the prompt never mentions.

Just by using EARS notation’s “unwanted behavior” pattern, error cases surface naturally. “What happens after 3 failures” is something you’ll forget to write unless you consciously think about it. But if you have the template “IF the credentials are incorrect…”, all you have to do is fill in the blank.
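Here is a rough Python sketch of what the login requirement above pins down: 3 consecutive failures, a 30-minute lock, and a notification. LoginGuard, record_attempt, and the notify callback are hypothetical names invented for this illustration. This is not code Kiro generated.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

MAX_FAILURES = 3                       # "incorrect 3 consecutive times"
LOCK_DURATION = timedelta(minutes=30)  # "lock the account for 30 minutes"


@dataclass
class LoginGuard:
    notify: Callable[[str], None]          # hypothetical notifier, e.g. sends an email
    failures: int = 0                      # consecutive failed attempts so far
    locked_until: Optional[datetime] = None

    def is_locked(self, now: datetime) -> bool:
        return self.locked_until is not None and now < self.locked_until

    def record_attempt(self, email: str, success: bool, now: datetime) -> str:
        """Apply the lockout requirement to one login attempt."""
        if self.is_locked(now):
            return "locked"
        if success:
            self.failures = 0              # a success breaks the consecutive streak
            return "ok"
        self.failures += 1
        if self.failures >= MAX_FAILURES:
            self.locked_until = now + LOCK_DURATION
            self.failures = 0
            self.notify(email)             # "send a security notification"
            return "locked"
        return "rejected"
```

Every constant in the sketch maps to a phrase in the requirement. With only “build a login screen” to go on, each of them would have been a guess.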

A notation born from jet engine safety work has trickled down to quality control for AI coding. For me, that trickle-down was the most exciting part.

“Months → 3 Weeks.” A Pharma AI Case Shows Kiro’s Power

Theory alone isn’t persuasive. So what actually happened with Kiro? Let me introduce two cases.

The first is a pharma AI case. It’s published on the AWS Industries Blog.

Three developers built a production-ready drug discovery agent in three weeks.

The data it handles is massive. It draws on over 30 biomedical databases, including PubMed and ChEMBL. PubMed alone adds about 1.5 million papers per year.

The system integrates this data across sources. It identifies therapeutic targets and produces evidence-based recommendations. A project of this scale conventionally takes months.

The key to finishing in three weeks lies in requirements.md. They defined “from which DB to fetch what” upfront. They also specified “the method for evaluating evidence reliability.” They decided “the recommendation output format.” All before the implementation stage.

The spec was solid before any code was written. That’s why even a team of three could build without hesitation.

The second case is Rackspace Technology. This one is even more dramatic. They completed work estimated at 52 weeks in 3 weeks. They report a 90% efficiency improvement.

What I want to focus on isn’t just speed. Because the spec exists first, you can verify the finished product against the spec. “It ran but I don’t know if it’s correct.” This anxiety is structurally eliminated.

From my experience doing CS at my previous job, let me say this. Development that runs on vague requirements is the classic way a project goes up in flames. Even in the era where AI writes the code, this principle doesn’t change.

Kiro × Cursor × Claude Code: Mapping the Three Philosophies

[Figure: Comparison of Kiro, Cursor, and Claude Code design philosophies. The structure vs. speed vs. autonomy triangle]

I’m not trying to say “Kiro is the strongest.” The three tools are solving different problems.

Cursor’s design philosophy is “speed.” Sub-200ms tab completion keeps your hands moving. It’s good at bug fixes, refactoring, and small-scale prototypes. It’s irreplaceable when speed is everything.

Claude Code’s strength is “autonomy.” It runs from the terminal and rides on Git workflows. Strong at large repository modifications. Good at changes spanning multiple files.

Kiro’s creed is “structure first.” It suits new project launches. There’s traceability from requirements to tasks. It shines in team development and in scenarios that require audits.

Here’s a rough guide to which one to use.

Build prototypes fast → Cursor
Make big changes to existing code → Claude Code
Launch new projects with quality built-in → Kiro

I use all three together. Initial design with Kiro. Daily fixes with Cursor. Large refactoring with Claude Code.

Take building a new business tool as an example. First, generate requirements.md with Kiro. Once the design is solid, run daily feature additions through Cursor. Three months later, if a large structural change is needed, leave it to Claude Code.

It’s not “which is on top.” Choose by “what you want to solve.” The fact that the tool options are expanding is itself progress.

What “Writing the Spec First” Actually Feels Like

After theory and case studies, let me talk about my own experience. I tried designing a business tool with Kiro’s free plan.

What I wanted to build was a Slack feedback auto-classification tool. It copies posts from a specific channel into a spreadsheet, sorted by category. I’d half-built it in Cursor before and then abandoned it.

I gave up in Cursor because I got stuck on the classification criteria. How do you separate bug reports from feature requests? How do you handle low-confidence classifications? Once I started thinking about it, I stalled.

I described the Slack feedback classification tool to Kiro.

The generated requirements.md surprised me.

# requirements.md (part of the requirements Kiro auto-generated)
# Acceptance criteria defined in EARS notation

## User Story 1: Auto-classification of feedback
WHEN a message is posted in the designated Slack channel
THE SYSTEM SHALL classify the message into one of:
  - bug_report
  - feature_request
  - praise
  - question
AND store the classification with confidence score

## Unwanted behavior
IF the classification confidence is below 0.7
THE SYSTEM SHALL flag the message for manual review
AND notify the admin channel

“Route to manual review if confidence is below 0.7.” This requirement didn’t occur to me when I was building it in Cursor. EARS notation’s “unwanted behavior” pattern surfaced the error case for me.
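In code, the two requirements above come down to a small routing rule. The Python sketch below is a rough illustration: Classification, route, and the action names are hypothetical, and only the four categories and the 0.7 threshold come from the generated requirements.md.

```python
from dataclasses import dataclass

CATEGORIES = {"bug_report", "feature_request", "praise", "question"}
REVIEW_THRESHOLD = 0.7  # "IF the classification confidence is below 0.7"


@dataclass(frozen=True)
class Classification:
    category: str
    confidence: float


def route(result: Classification) -> list[str]:
    """Return the actions to take for one classified Slack message."""
    if result.category not in CATEGORIES:
        raise ValueError(f"unknown category: {result.category}")
    # The classification and its confidence score are always stored.
    actions = ["store"]
    # Low-confidence results also go to a human and the admin channel.
    if result.confidence < REVIEW_THRESHOLD:
        actions += ["flag_for_manual_review", "notify_admin_channel"]
    return actions


if __name__ == "__main__":
    print(route(Classification("bug_report", 0.55)))
```

The comparison is strict, so a score of exactly 0.7 is not flagged. That matches “below 0.7”, and it is the kind of boundary a vague prompt would have left to chance.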

design.md was also generated. It spelled out the integration design: Slack API → classification engine → Google Sheets API. tasks.md lined up 12 implementation tasks in checklist form.

The whole picture is visible before writing code. The peace of mind is qualitatively different from Cursor’s “just get it running.”

When I actually moved on to implementation, I understood where that peace of mind came from. I spent zero time wondering “what to do next.” All I had to do was work through tasks.md in order, starting from task 1. Even while writing code in Cursor, “which part of requirements.md am I implementing right now” was always clear.
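Working through a checklist in order is easy to mechanize. The Python sketch below assumes that tasks.md uses Markdown checkbox items such as “- [ ] 2. Build classifier”. That format is my assumption for illustration, and next_open_task and progress are hypothetical helpers, not part of Kiro.

```python
import re
from typing import Optional

# Matches a Markdown checkbox item such as "- [ ] 3. Write tests" or "- [x] 1. Set up project".
_TASK = re.compile(r"^\s*[-*]\s*\[(?P<mark>[ xX])\]\s*(?P<title>.+?)\s*$")


def next_open_task(tasks_md: str) -> Optional[str]:
    """Return the title of the first unchecked task, or None if all are done."""
    for line in tasks_md.splitlines():
        match = _TASK.match(line)
        if match and match.group("mark") == " ":
            return match.group("title")
    return None


def progress(tasks_md: str) -> tuple[int, int]:
    """Return (completed, total) across all checkbox items."""
    marks = [m.group("mark") for m in map(_TASK.match, tasks_md.splitlines()) if m]
    return sum(mark != " " for mark in marks), len(marks)


if __name__ == "__main__":
    sample = "- [x] 1. Set up Slack app\n- [ ] 2. Build classifier\n"
    print(next_open_task(sample), progress(sample))
```

The answer to “what do I do next” is always the first unchecked line, which is why there was nothing left to wonder about.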

To be honest, spec generation takes 10–15 seconds per run. If you’re used to instant completions, it’s a bit annoying.

But think about the design precision you get in 15 seconds. The “what should the classification criteria be” problem that had me stuck for 30 minutes in Cursor was solved in 15 seconds of requirements.md generation.

“Just build something that works” is my philosophy. That hasn’t changed. What changed is that “spend just 5 minutes confirming the spec before running it” was added to the procedure. Just 5 minutes saves 3 hours of later fixes.

Vibe Coding’s Quality Problem Wasn’t About “Tools” but “Order”

Let me organize what I’ve covered.

The cause of unstable quality wasn’t insufficient tool performance. The step of defining the spec was missing. That’s all there is to it.

What Kiro proved is the effect of “reversing the order.”

Pharma AI: Months → 3 weeks. Locked down requirements first with requirements.md
Rackspace: 52 weeks → 3 weeks. 90% efficiency gain with spec-driven approach
EARS notation: A method derived from jet engine safety standards can be used for AI coding quality control
Over 250,000 users: You can start with the free plan

I chased the security problem in the trilogy. I wrote about the transition “from vibes to governance” in the fourth installment. This conclusion goes one step further.

The true nature of governance was “writing the spec first.”

What I Want You to Try Just Once

When you try Kiro, you don’t have to start with a big project. You can experience it in these steps.

1. Register for the free plan at kiro.dev
2. Think of one tool you keep meaning to build in Cursor but keep putting off
3. Type into Kiro “I want to build ◯◯. First, generate requirements.md”
4. Read the generated requirements. Look for items where you think “I hadn’t thought of this”

If even one “I hadn’t thought of this” comes up at step 4, Kiro’s value is proven. The experience of reading the spec and realizing “ah, this is how much we needed to decide” is the entry point to vibe coding’s next stage.

You’ve become conscious of security. You’ve understood the importance of governance. Now you build the habit of writing the spec first. With these three steps, vibe coding transforms from “play” to “weapon.”

I remember the time I worked with professional engineers. The first thing they did was requirements definition. Writing code was last.

Back then, I couldn’t do that “first design” myself. I thought I lacked technical skill.

Kiro is a tool where AI does that design for you.

With Cursor, I felt as if an expert engineer’s experience had taken up residence in me. Claude Code deepened that conviction. What Kiro added is “the power to design.”

You can write. You can fix. And you can design. Vibe coding has entered the next stage.

If anyone feels anxious about vibe coding right now, I want you to try “changing the order” rather than “changing the tool.” Write the spec first. Just that alone changes the experience.

With kiro.dev’s free plan, please just try generating requirements.md once. I want to share this experience with people who are hitting the same walls.
