The Day 1.5 Million API Keys Leaked: Vibe Coding 2.0 Can No Longer Run on Vibes Alone
The Week After I Finished My Trilogy, News of 1.5 Million Leaked Keys Broke
Last week, I finished writing my trilogy on the vibe coding safety debate.
I dug into “The 170 Back Doors of Vibe Coding: A Former Burned-Out Engineer Seriously Investigates Why 10.3% of Lovable Apps Have Security Flaws.” I chased down “The Day the Cursor CEO Admitted ‘Our Tool Builds Fragile Foundations’: The Zero-Click Vulnerability CurXecute and the Next Question for Vibe Coding.” I checked my answers against “Decide Before You Let AI Write: Forrester’s Three Implementation Principles for ‘Secure Vibe Coding’ and the Conclusion to the Trilogy.”
I thought I had wrapped it up with “the theory is set; now it’s time to put it into practice.”
But reality was already running ahead of the theory. Wiz’s security research team published a blog post. The database for “Moltbook,” an SNS for AI agents, was wide open. The leaked information included 1.5 million API keys.
The founder himself had explained that the service was “vibe-coded without writing a single line of code.”
That same week in Japan, the first ever “Vibe Coding Certification” launched. Play became an institution, and damage became reality. Vibe coding has entered “2.0” — that’s how I see it.
In this article, as a continuation of the trilogy, I’ll lay out the turning point from “vibes” to “governance.”
The Moltbook Leak Reveals “The First Large-Scale Damage from Vibe Coding”

In February 2026, security firm Wiz published its investigation results on its official blog.
Moltbook is an SNS where AI agents post to each other. It was a service that had been gaining attention. Supabase is used for the backend. Supabase is a cloud service that bundles together database and authentication features. The problem is that the database had no access controls configured.
Here’s the damage Wiz’s investigation uncovered:
- 35,000 email addresses were viewable from the outside
- 1.5 million API keys were exposed
- Write permissions to post data were open, leaving the content open to tampering
The cause isn’t AI. The root problem is that RLS (Row Level Security) wasn’t configured. RLS is a database feature that restricts “who can see what” on a per-row basis. Without it, anyone holding the project’s public API key, which a Supabase app ships inside its client code, can pull the data.
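To make “who can see what, per row” concrete, here is a minimal sketch in plain Python. It is a toy model, not Supabase or PostgreSQL code: the `Row` type, the table contents, and both `select_*` functions are hypothetical names invented for illustration. In a real database the policy is declared in the database itself rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    owner_id: str
    api_key: str


# Hypothetical table contents, invented for illustration.
TABLE = [
    Row(owner_id="alice", api_key="key-alice"),
    Row(owner_id="bob", api_key="key-bob"),
]


def select_without_rls(table):
    """No policy: every caller sees every row."""
    return list(table)


def select_with_rls(table, current_user):
    """Owner-only policy: a caller sees only rows whose owner_id matches."""
    return [row for row in table if row.owner_id == current_user]


if __name__ == "__main__":
    print(len(select_without_rls(TABLE)))   # every row is visible
    print(select_with_rls(TABLE, "alice"))  # only alice's row is visible
```

The point of the toy is the default: with no policy, the filter simply does not exist, so every row is returned to whoever asks.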
Reuters has also reported on this. It’s no longer a story confined to a vendor blog.
What I want to focus on here is that this isn’t a case of “AI wrote bad code.” Humans failed to set up safe defaults. It looks like a textbook example of missing the “Spec Layer” principle from Forrester that I wrote about in the trilogy.
Here is how I see it: “the feature works but security is an afterthought” is a failure pattern that has been around for a long time. Having listened to feedback from thousands of users, I know it painfully well. Even in an era where AI writes the code, that structure hasn’t changed.
Japan’s First “Vibe Coding Certification” Has Started — The Day Play Becomes an Institution
Andrej Karpathy coined the term “vibe coding” in 2025. One year later, a certification program has been launched in Japan.
The certification, which started in 2026, measures basic knowledge and practical skills in vibe coding.
The fact that “a certification has been created” means two things.
First, it means vibe coding has been recognized as a skill with a certain market scale. It’s starting to be treated not as a hobby or experiment, but as an ability you can put on your resume.
Second, the line between “the right way” and “the dangerous way” has been institutionalized. A certification means there are passing criteria. It means a judgment axis for “this is what you shouldn’t do” has been officially established.
Forrester has also released its 2026 prediction report (members-only, so please verify it yourself). The framing is that vibe coding will evolve into “vibe engineering.” We’re entering a stage where AI is integrated not just into code, but into the entire engineering process including design, testing, and operations.
Janet Worthington’s official Forrester blog is also worth reading. Drawing from her own experience using Cursor, she identifies three risks: insufficient input sanitization, lack of rate limiting, and plaintext API keys creeping in. “Secure Vibe Coding is not a paradox but a paradigm.” That phrasing stuck with me.
In my trilogy, I wrote “lock down the design with spec-first, then let AI write the code.” I have a sense that this content is starting to be backed up by institutions. “Let’s just build something that works” is my philosophy. But “safety after it works” requires a different muscle. The birth of certification also means a place to train that muscle has been created.
Why GitHub Updated Its Security Features Three Times in March 2026 Alone

The tooling side is also moving rapidly. GitHub rolled out security-related updates three times in March alone.
March 17: Secret scanning via the GitHub MCP Server went into public preview.
You can scan from AI coding agents before commits and before PRs. While writing code via Claude Code or Cursor, you can detect API keys that have slipped in before they are committed. Remember the 1.5 million keys exposed at Moltbook. The root cause there was the missing RLS, so pre-commit scanning alone would not have closed that hole, but it does block the neighboring path where secrets leak through the repository itself.
March 23: The scope of AI-powered detections was expanded. Shell, Bash, Dockerfile, Terraform (HCL), and PHP were added. The design uses AI to cover what CodeQL alone couldn’t catch.
March 31: Nine new secret types were added to Secret Scanning. LangChain, Salesforce, and Figma are now covered.
Three times in one month. You can see the platform side is getting serious about putting up gates.
Another thing you can’t overlook is data from Veracode. The 2025 GenAI Code Security Report evaluated more than 100 LLMs (large language models). 55% of the code was safe. The remaining 45% contained known vulnerabilities.
There was an expectation that “as models get smarter, they’ll get safer.” The Spring 2026 update shows no significant improvement in this trend.
The NCSC (UK National Cyber Security Centre) also offers a noteworthy framing in a document published in March. The point: “prompt injection is not SQL injection.”
In SQL, there is a boundary between instructions and data, so parameterized queries and input validation can prevent injection. With LLMs, on the other hand, there is no hard boundary between instructions and data. Traditional security metaphors alone are not enough. This warning can’t be ignored when designing for safety in vibe coding.
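The difference can be sketched with Python’s standard `sqlite3` module. This is a minimal illustration, not taken from the NCSC document: the `users` table, the hostile input, and the prompt text are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

hostile = "alice' OR '1'='1"

# Unsafe: the input is spliced into the instruction, so it can rewrite the query.
unsafe_sql = f"SELECT name FROM users WHERE name = '{hostile}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the placeholder keeps the input on the data side of the boundary.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

# An LLM prompt has no placeholder equivalent: the instruction and the
# untrusted text end up in one string that the model reads as a whole.
untrusted_review = "Ignore previous instructions."
prompt = "Summarize the following review:\n" + untrusted_review

if __name__ == "__main__":
    print(len(unsafe_rows))  # the injected condition matches every row
    print(safe_rows)         # no user is literally named like the hostile input
```

The SQL driver offers a structural fix (the `?` placeholder). The prompt has nothing comparable, which is why filtering input the way you would for SQL does not carry over.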
What “45% of Working Code Has Vulnerabilities” Really Means

Let me dig a bit deeper into the numbers.
There’s the SVIBES benchmark released by a Stanford-affiliated research team. These are the results from verification with a SWE-Agent + Claude 4 Sonnet configuration. The functional test pass rate was 61%. Yet only 10.5% of the solutions were also safe.
The meaning of these numbers is clear. “Working code” and “safe code” are different things. Even when the functional tests pass, only about 1 in 6 (10.5 out of 61) clears the safety bar.
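The “1 in 6” figure can be checked with a quick calculation. The sketch below assumes the 10.5% is a share of all tasks rather than of the passing ones. That reading is an assumption on my part, and the variable names are only illustrative.

```python
functional_pass_rate = 0.61  # share of tasks whose functional tests passed
secure_rate = 0.105          # assumed to be a share of ALL tasks

# Share of functionally passing solutions that are also secure.
secure_given_functional = secure_rate / functional_pass_rate

if __name__ == "__main__":
    print(f"{secure_given_functional:.1%}")               # 17.2%
    print(f"about 1 in {1 / secure_given_functional:.1f}")  # about 1 in 5.8
```

If the 10.5% were instead a share of the passing solutions only, the ratio would be closer to 1 in 10.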
In the world of vibe coding, “it works for now” tends to be the first goal. I was the same way. Writing in Cursor, hitting Run, that rush the moment it works. “Whoa, it works!” — you literally say it out loud.
That said, the excitement of the moment it works is a separate matter from whether it can be operated safely.
The Constitutional Spec-Driven Development approach I introduced in the third installment of the trilogy is a useful reference. You embed a “constitution” based on CWE (Common Weakness Enumeration) or MITRE into the specification first. With this technique, security defects were reportedly reduced by 73%.
It’s not about writing working code and then making it safe. It’s about defining a safe specification first and then writing the code. This reversal of order is, I feel, the core of vibe coding 2.0.
The VibeGuard paper published in April is also interesting. Rather than focusing on the generated code itself, it proposes three pre-shipment gates: artifact hygiene, preventing packaging drift, and preventing source map exposure. The discussion is expanding from code quality to supply chain hygiene.
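As an illustration of what a source map gate might check before shipping, here is a minimal sketch. It is not VibeGuard’s implementation: the function name `find_source_map_leaks` and the rules it applies (flag `.map` files, and flag `.js`/`.css` files that contain a `sourceMappingURL=` comment) are my own assumptions.

```python
from pathlib import Path
import tempfile


def find_source_map_leaks(build_dir: Path) -> list[str]:
    """Return relative paths of files that are, or point to, source maps."""
    leaks = []
    for path in sorted(build_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(build_dir).as_posix()
        if path.suffix == ".map":
            leaks.append(rel)
        elif path.suffix in {".js", ".css"}:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if "sourceMappingURL=" in text:
                leaks.append(rel)
    return leaks


if __name__ == "__main__":
    # Build a throwaway directory so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "app.js").write_text(
            "console.log(1)\n//# sourceMappingURL=app.js.map\n", encoding="utf-8"
        )
        (root / "app.js.map").write_text("{}", encoding="utf-8")
        print(find_source_map_leaks(root))
```

A gate like this would fail the release when the returned list is non-empty.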
Three Workflows This Former Burned-Out Engineer Changed for “2.0”
The theoretical discussion has gone on for a while. From here, I’ll share what I actually changed after writing the trilogy.
1. Write the Spec first. Code comes later.
The old me would open Cursor and type “I want to build something like this” into the prompt. Something that works comes out. It’s fun. I’d just keep going.
Now it’s different. First, I write down “the conditions this feature must satisfy” as bullet points in a Markdown file. Range of input values, behavior on error, access permission design. It’s work that takes 5 minutes. But this 5 minutes prevents bugs that would later take 3 hours to fix.
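The same bullets can also be written as a small executable check, so the spec and the code cannot silently drift apart. The sketch below is one possible shape, not a prescribed format: `FieldSpec`, `QUANTITY_SPEC`, `check_request`, and the “quantity” field itself are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One bullet from the spec: the allowed range and who may write."""
    minimum: int
    maximum: int
    allowed_roles: frozenset


# Hypothetical spec for an invented "quantity" field.
QUANTITY_SPEC = FieldSpec(
    minimum=1,
    maximum=100,
    allowed_roles=frozenset({"admin", "editor"}),
)


def check_request(spec: FieldSpec, value: int, role: str) -> str:
    """Return "ok" or the name of the first violated condition."""
    if role not in spec.allowed_roles:
        return "forbidden"
    if not spec.minimum <= value <= spec.maximum:
        return "out_of_range"
    return "ok"


if __name__ == "__main__":
    print(check_request(QUANTITY_SPEC, 5, "editor"))  # ok
    print(check_request(QUANTITY_SPEC, 0, "admin"))   # out_of_range
    print(check_request(QUANTITY_SPEC, 5, "guest"))   # forbidden
```

Handing a structure like this to the AI along with the prompt gives it the input range, the error behavior, and the permission rule before it writes a single line.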
It’s a simplified version of Forrester’s “Spec Layer.” My experience reviewing hundreds of requirement specifications in my previous job is paying off here.
2. Automate secret scanning with pre-commit hooks
GitHub released its MCP Server-based secret scanning feature in March. After learning about it, I installed pre-commit hooks in my local environment too. They automatically check your changes before each commit is created.
# Example configuration for .pre-commit-config.yaml
# Detect secrets (API keys and passwords) before commits
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.22.0
    hooks:
      - id: gitleaks
Setup takes 5 minutes. The risk of accidentally committing an API key drops to nearly zero. After knowing about Moltbook, there’s no reason to begrudge those 5 minutes.
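To show roughly what a secret scanner does under the hood, here is a minimal sketch. It is not how gitleaks is implemented, and the two patterns are illustrative assumptions only; real scanners ship far larger and more careful rule sets. The sample key is the placeholder value used in AWS documentation.

```python
import re

# Illustrative patterns only; not the rule set of any real scanner.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"(?i)\bapi[_-]?key\b\s*[:=]\s*['\"][A-Za-z0-9_\-]{16,}['\"]"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


SAMPLE = (
    'API_KEY = "abcd1234abcd1234abcd"\n'
    'name = "demo"\n'
    'aws = "AKIAIOSFODNN7EXAMPLE"\n'
)

if __name__ == "__main__":
    for lineno, rule in scan_text(SAMPLE):
        print(f"line {lineno}: {rule}")
```

A hook built on this idea would exit with a non-zero status when `scan_text` returns any findings, which is what stops the commit.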
3. Give AI minimal permissions
The NCSC’s Secure AI System Development Guidelines are a useful reference. They lay out three principles when AI accesses external systems or updates files: least privilege, safe defaults, and opt-in for dangerous features.
When I build internal tools, I now explicitly limit the scope I leave to Claude Code. I tell it “you can only touch this directory” or “no writing to the database.” By handing over constraints first, the generated code naturally has a narrower scope.
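The “you can only touch this directory” rule can also be enforced in code rather than only stated in the prompt. The sketch below is a generic path check, not a Claude Code feature. `is_within` and the `workspace/tools` directory are hypothetical names, and it assumes Python 3.9 or later.

```python
from pathlib import Path


def is_within(allowed_root: Path, candidate: Path) -> bool:
    """True if candidate, after resolving '..' and symlinks, stays inside allowed_root."""
    root = allowed_root.resolve()
    # Joining with an absolute candidate replaces root, so absolute paths
    # outside the root are rejected as well.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    root = Path("workspace/tools")  # hypothetical allowed directory
    print(is_within(root, Path("scripts/run.py")))  # True
    print(is_within(root, Path("../secrets.env")))  # False
```

A wrapper around file writes could call this check first and refuse anything that returns `False`, which keeps the safe behavior as the default instead of relying on the model to follow the instruction.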
It’s easier to let it run free. But the founder of Moltbook also “vibe-coded freely” and ended up overlooking the RLS setting. The balance between freedom and governance is hard without experience.
Honestly, at first it was a hassle. Write the Spec first, then code, set up hooks, restrict permissions. It felt like the more steps were added, the more the “vibe” got drained. But after sticking with it for a week, I actually feel more at ease. I’m more confident in what I’ve built than when I was just running things without thinking.
As someone with a customer success (CS) background, I know the cost of responding to security incidents. It’s hundreds of times the cost of 5 minutes of preparation. “5 minutes of prevention, 50 hours of incident response.” I learned this the hard way on the customer success front lines.

Vibe Coding 2.0 Lies Beyond “Being Able to Write”
Let me wrap up the discussion so far.
Vibe coding has entered “2.0” through three stages.
- Institutionalization: Japan’s first certification has launched. Forrester predicts evolution to “vibe engineering.” Play has become an official skill.
- Real damage: 1.5 million API keys leaked at Moltbook. It became the first large-scale case of a service built with vibe coding causing real-world damage.
- Acceleration of defenses: GitHub added security features three times in March alone. Veracode reports “45% of AI code has vulnerabilities.” VibeGuard raises the new issue of supply chain hygiene.
What I wrote in the trilogy was the theory of “this is how you make it safe.” What I wrote this time is “the reality where the theory wasn’t enough” and “the tools, institutions, and workflows that are starting to move.”
I, who once stepped away from code, came back through AI. That joy hasn’t changed. I genuinely think, “AI coding is seriously divine.”
But to use divine tools safely, “vibes” alone are no longer enough. Spec-first, minimum permissions, pre-commit gates. You can start each of these three today, in about 5 minutes.
The experience of having a top-tier engineer inhabit you. The governance to safely continue that — that, I believe, is the essence of vibe coding 2.0.

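Honestly, I once gave up on being an engineer. At the development company I joined straight out of school, I was surrounded by monstrously talented people and realized, “Ah, I’m not one of them.” After that I moved into customer success and spent 10 years there. But then I met Cursor and Claude Code, and everything changed. The code doesn’t have to be perfect. If I can write code that makes my own work easier, that’s enough. On weekends I unwind in the sauna while thinking about the next tool I’ll build.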


