
"80% of Our Product Was Written by AI" Becomes Reality at a Japanese Startup. Mapping Production Operations by Pairing the PeopleX Declaration with Cursor Composer 2

PeopleX announced that 80% of its new implementation is AI-generated. Paired with the official release of Cursor Composer 2, vibe coding has moved past the "experimental phase" and entered the "production operations phase." As a $20/month user, I'll translate this into a production-readiness checklist you can run this week.

Yesterday morning, I wrote about SpaceX setting a $60B acquisition option on Cursor. It was a piece that broke down three numbers — $60B, $10B, and $1.3B — and redrew the map for those of us using Cursor at $20/month. After finishing it, I had a thought: “If I only talk macro, some readers will open Cursor again tomorrow and have no idea what to actually change.”

The answer to that was already on the table back in June 2025.

PeopleX, a Japanese HR startup, announced that “approximately 80% of new implementation source code across all our products is developed by AI.” The announcement was on June 4, 2025. The primary source is PR TIMES. CreatorZine and Nikkei reported on it simultaneously. The tools they adopted were Cursor and v0. They officially declared they had placed “vibe coding” — the development style that became known as “instructing AI in natural language to generate code” — at the core of their actual products.

About 9 months later, Cursor released Composer 2 in general availability (March 19, 2026). It’s an AI model specialized for programming, scoring 73.7 on SWE-bench Multilingual. It put itself forward as a strong frontier-class candidate in the coding domain.

[Figure: horizontal timeline of vibe coding's move into production, starting from "2025-06-04 PeopleX declares 80% of new implementation built with AI"...]

Let me say this as a former burnout engineer. This is not a “doesn’t apply to me” article. The moment you open Cursor at $20/month, you’re standing on the same infrastructure as PeopleX. The only difference is the decision: “What percentage do I delegate?”

In this article, instead of celebrating the $60B, I’ll break down the operational number: 80%. I burned 3 hours on this, so I want to hand you a frame you can pick up in 3 minutes.

[Facts] Breaking Down PeopleX’s “80% Built by AI” Declaration

The next two H2 sections stick strictly to organizing the reported primary information. My opinions and action proposals are gathered in the later [Opinion] and [Action] sections.

First, let me lay out only the facts that can be verified in the original PR TIMES article and multiple news reports. My interpretation goes in the next section.

What Was Announced

PeopleX is a Japanese startup developing HR-focused SaaS and an “AgenticHR Platform.” Four main facts can be confirmed from this announcement.

| Item | Content | Source |
| --- | --- | --- |
| Tools adopted | Cursor (AI editor) and v0 (UI auto-generation tool) | PR TIMES |
| Distribution scope | All engineers and designers | Same |
| Number | Approximately 80% of new implementation source code across all products is AI-generated | Same |
| Development method | Vibe coding (AI generates code from natural language instructions) | Same |

“All products” refers to multiple SaaS offerings PeopleX has publicly released. The 80% figure is conditional on “new implementation only” — it does not mean “80% of the entire existing codebase is AI-generated.” Skip past this and you’ll misread the number.

“80% of New Implementation” ≠ “80% of the Entire Codebase”

This matters enough that I’ll indent it. When the news writes “80% built by AI” in one breath, it gets misread as “80% of the entire software product is AI-made.” The original meaning is “approximately 80% of newly added or modified implementation lines are AI-generated” — not the historical codebase as a whole.

Here’s an analogy. AI drafted 80% of this year’s entries in your planner. It did not go back and rewrite 80% of the entries from the past 10 years. Remember it as a story about new work.

CTO Quote: “Implement at Abnormal Speed”

A comment from PeopleX’s CTO is quoted within the PR TIMES article. One of the company’s core values is “Implement into society at abnormal speed,” and they describe placing AI tools at the core of their development flow as the means of achieving this.

Read through a burnout-engineer lens, this doesn’t sound like “a story about tool selection.” It sounds like “a management decision to embed vibe coding into the operational OS.” Cursor’s monthly fee is around $20 per license (on the individual plan). Even distributing it to every engineer and designer is a rounding error against annual personnel costs. If that runs “80% of new implementation,” the ROI is in a different order of magnitude.

Compared to Other Companies

When you read the PeopleX 80% number, putting it next to other production cases makes the map three-dimensional.

  • Money Forward: As of Q1 2026, Cursor distributed to all ~1,000 engineers (see prior article — based on what I republished in March)
  • PlanetScale: Cut the equivalent of 2 FTE by introducing AI code review (same prior article)

Note: Multiple secondary reports state that Anthropic generates over 90% of its own code with AI, but as of now the primary source is unverified, so I’ll treat it as reference information only.

PeopleX’s 80% sits in the high-end range even among confirmed cases. Rather than “Japanese startups have caught up to world-class productivity,” the more accurate reading is “We’ve entered an era where the business doesn’t run unless you ride world-class productivity.”

[Figure: three case-study cards on production vibe coding deployments, starting from "PeopleX: Approx. 80% of new implementation AI-generated, Cursor + v0, June 2025 announcement (source: PR TIMES)"...]

[Facts] How Cursor Composer 2’s GA Release Changed the Way Teams Operate

The infrastructure side that supports PeopleX’s 80% also moved afterward. Cursor released Composer 2 to general availability on 2026-03-19 (sources: SiliconANGLE, 2026-03-19; Cursor official blog).

Composer 2’s Core Specs

Let me lay out the published numbers, noise-free.

| Metric | Composer 2 | Comparison | Source |
| --- | --- | --- | --- |
| Context | 200,000 tokens | Claude Opus 4.6: 200K | SiliconANGLE |
| Architecture | Mixture-of-Experts (MoE) | - | Same |
| Price (API equivalent) | Input $0.50/M, Output $2.50/M | Opus 4.6: Output $15/M (reference) | Same |
| SWE-bench Multilingual | 73.7 | Opus 4.6: similar band | DataCamp Composer 2 review |
| Terminal-Bench 2.0 | 61.7 | Top of the same band | Same |
| CursorBench | 61.3 | Same band as GPT-5.4 medium and low | Same |

Mixture-of-Experts (MoE) is the umbrella term for a design where only a few specialized sub-networks (“experts”) activate for each input, instead of the whole model. Because only part of the parameters runs at each step, it is faster and cheaper than firing every parameter every time. By adopting this mechanism, Cursor pulled off coding specialization at $2.50 per million output tokens.

Reading the Benchmarks in 3 Lines

Benchmark numbers tend to land as “OK, but what does it actually do?” if you’re not used to them. Let me translate them in field-level terms.

  • SWE-bench Multilingual 73.7: Passes 73.7% of real GitHub bug-fix challenges
  • Terminal-Bench 2.0 61.7: Successfully completes 61.7% of tasks involving terminal-based file ops, execution, and verification end-to-end
  • CursorBench 61.3: Passes 61.3% of Cursor’s own real tasks

Across all three indicators, it sits shoulder to shoulder with frontier-class models. It reads as: “A coding-specialized model that’s a strong candidate for production use is now inside the Cursor subscription.”

The Production Window Composer 2 Opened on “Price × Speed”

This is the crux that connects to PeopleX’s 80%. Composer 2’s price is $2.50 per million output tokens. That’s close to one-sixth of Opus 4.6’s output rate (reference: $15/M). In coding work, both input (reading existing code) and output (generating code) are large. A price one-sixth as high means you can run six times the work on the same budget.
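To make the one-sixth claim concrete, here is a minimal sketch of the arithmetic, counting output tokens only. The two rates are the ones quoted above; the function names `price_ratio` and `tokens_for_budget` and the $100 budget are my own illustrative assumptions, not anything from a vendor's pricing page.

```shell
# Sketch: compare output-token prices (USD per million tokens).
# Rates come from this article; the $100 budget is a made-up example.

# price_ratio <expensive_rate> <cheap_rate>: how many times cheaper the second rate is
price_ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# tokens_for_budget <budget_usd> <rate_per_million>: millions of output tokens the budget buys
tokens_for_budget() {
  awk -v budget="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", budget / rate }'
}

echo "Output price ratio (Opus 4.6 reference vs Composer 2): $(price_ratio 15 2.50)x"
echo "Output tokens for \$100 at \$2.50/M: $(tokens_for_budget 100 2.50)M"
echo "Output tokens for \$100 at \$15/M: $(tokens_for_budget 100 15)M"
```

Input tokens are priced separately, so treat the ratio as an upper-level guide rather than a full bill.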

In an enterprise, “distribute an AI editor to every engineer and let AI handle 80% of new implementation” would strain the budget at Opus 4.6-equivalent rates. PeopleX made its call in June 2025, about nine months before Composer 2 went GA, so Composer 2 did not enable that decision. What it does is lower the bar: with pricing like this, “all-hands distribution × 80% new code” becomes economics that far more companies can run as a business operation.

# Example: Estimating one engineer's daily AI-dependent code volume
# Average hand-written code volume: 200 lines/day
# Assume AI assistance triples generated lines: 600 lines/day
# Output token equivalent (1 line ≈ 40 tokens): 24,000 tokens/day

# Composer 2 output rate: $2.50/M
# Approx. API cost per person per day: 24,000 × $2.50 / 1,000,000 = $0.06
# Over 20 business days: $1.20/person/month

# Cursor subscription (individual plan): $20/month
# With Composer 2's pricing design, this fits well within the subscription

This cost structure is what makes a decision like PeopleX’s “distribute to every engineer” repeatable for other teams. Read Composer 2 not merely as “a new model” but as “a cost structure that opened the window to production operations.”
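If you want to plug in your own numbers, the estimate above can be sketched as a small function. This is a rough sketch under the same assumptions as the estimate (600 lines/day, about 40 tokens per line, output tokens only); `estimate_monthly_cost` is a name I made up for illustration.

```shell
# estimate_monthly_cost <lines_per_day> <tokens_per_line> <usd_per_million_output_tokens> <business_days>
# Prints the estimated output-token cost per person per month in USD.
estimate_monthly_cost() {
  awk -v lines="$1" -v tpl="$2" -v rate="$3" -v days="$4" 'BEGIN {
    tokens_per_day = lines * tpl                    # e.g. 600 * 40 = 24,000
    cost_per_day = tokens_per_day * rate / 1000000  # e.g. 0.06 USD
    printf "%.2f\n", cost_per_day * days
  }'
}

echo "Estimated output cost: \$$(estimate_monthly_cost 600 40 2.50 20) per person per month"
```

Input tokens (reading existing code) are not included here, so a real bill would be higher than this figure.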

Composer 2 vs. Claude Code vs. GitHub Copilot — Current Positioning

Building on the Cursor Composer 2 context tracked in docs/research/toomi_claude_2026-04-29.md, here’s the frame I use this week to divvy up tools. This is just a snapshot — assume it changes by next month.

| Tool | Strengths | Weaknesses | Fit for “Production” |
| --- | --- | --- | --- |
| Cursor + Composer 2 | $2.50/M output, coding-specialized, MoE-fast | UI-first means weaker CLI control | Strong for daily implementation across an entire team |
| Claude Code (Opus 4.6 / Sonnet 4.6) | Long-form design instructions, transparent thinking, CLI control | Relatively higher unit cost | Strong for design reviews and long-form tasks |
| GitHub Copilot (GPT-5 series) | Low friction in existing IDE integration | Thin visibility into thinking process | Quick adoption for personal OSS and small teams |

PeopleX’s 80% runs on a Cursor + v0 setup. You don’t need to use all three. But a split like “Cursor for daily implementation, Claude Code for long-form design, Copilot for lightweight completion” is working at my desk.

[Figure: selection matrix comparing Composer 2, Claude Code, and GitHub Copilot across strengths, weaknesses, and production fit...]

[Bridge] Yesterday’s $60B and Today’s 80% Are Different Angles on the Same Event

This is a bridging section that mixes facts and opinion. Stating that explicitly upfront.

The SpaceX × Cursor $60B article I wrote yesterday was a macro event on the capital side. A story moving in the trillion-yen range. PeopleX’s 80%, on the other hand, is a micro event on the operations side. A story where a Japanese startup showed in numbers, in June 2025, “this is how we’re actually running production.”

Lined up, here’s how it looks.

| Angle | 4/29 article (Macro · Capital) | This article (Micro · Operations) |
| --- | --- | --- |
| Subject | SpaceX × Cursor | PeopleX |
| Numbers | $60B acquisition option, $10B partnership fee | 80% of new implementation across all products |
| Domain | Map of investment and ownership | Development flow and operational OS |
| Message to readers | “Your $20/month is a vote” | “What percentage will you delegate to AI?” |
| Common direction | Vibe coding has hit the industry mainstream | Vibe coding has hit production in the field |

Read the two articles as a set, and the structure becomes visible: “Across 2025–2026, vibe coding entered the mainstream on both the capital and operations fronts.” SpaceX × Cursor’s $60B and PeopleX’s 80% aren’t separate events — they’re two angles on the same event.

The Translation for Individual Developers

As a member of the burnout club, let me write the translation for those of us using it at $20/month.

$60B was a number showing “the size of the ship you’re on.” 80% is a number showing “how much luggage the passenger next to you on the same ship is checking with the AI.” Ship size doesn’t connect directly to individual decisions, but the next passenger’s check-in rate does. If you’re at “30% delegated” while the passenger next to you is already at 80%, that gap turns into orders of magnitude in productivity within 1–2 years.

This isn’t a threat. I myself, just three months ago, was using Cursor in “decide manually whether to accept each completion candidate” mode. Not anymore. Since switching to “let AI write first, then move to the reviewing side,” my operational headroom went up 3–4x. The moment you transition from 30% to 70%, your sense of productivity changes.

PeopleX’s 80% is a number that asks whether you can perform this transition not as “one person’s experience” but as “the organization’s operational OS.” As a member of the burnout club, there’s no reason not to be pulled in.

What Happens When Macro and Micro Align in a Short Window

When macro events (capital) and micro events (operations) align in a short period, the industry’s direction gets locked in by one notch. Funding flows in, production adoption emerges, the media picks it up. Phases where these three points overlap are often called “phase transitions” in retrospect.

Take November 2022, the week ChatGPT launched. At the time, I brushed it off as “interesting tool that came out,” but six months later my development flow had been rewritten. The accumulation of yesterday’s $60B and PeopleX’s 80% has the same scent. Whether to skip it or ride it is up to the reader, but the risk of skipping isn’t small.

[Opinion] Four Axes That Separate “Experiment” from “Production”

From here on, this is my judgment frame based on field instinct. These are not facts backed by sources; they are observations through the operations lens of someone with a customer success (CS) background, filtered through the burnout club.

PeopleX’s 80% was described as “having entered the production operations phase.” But the boundary between production and experiment tends to blur. Let me write out the 4-axis judgment frame I use at hand. Three or more apply → “production.” Two or fewer → “experiment.”

[Figure: 4-quadrant matrix separating "experiment" from "production," with axes labeled "Code review system," "Rollback method," "Model dependency duplication," "Operational logging"...]

Axis 1: Is There a System for Reviewing Every Line of AI-Generated Code?

The question is whether the organization has built in a system where “humans always read code that AI wrote.” This is the most important axis.

The reason PeopleX’s 80% can be called “production” is, presumably, because internally there’s a standard flow of “AI-generated code still has to pass the normal PR review.” Cursor’s diffs land in a GitHub PR, a reviewer takes a look, tests pass, and only then does it merge. With this path in place, even AI-written lines have passed under human eyes at least once.

By contrast, individual projects merging on “if AI wrote it, it’ll probably work” drop out here. Honest confession from a burnout-club member: until three months ago, I was on this side. Code that worked was merged as-is, code that didn’t was patched up. From a production-operations viewpoint, that’s a fail.

Axis 2: Can Rollback Always Run with One Command?

When AI-written code causes a production incident, how many minutes does it take to revert to the previous version? Even individual developers should have this in place before approaching production operations.

The first thing I thought when reading the PeopleX announcement was an operational question: “If you’ve delegated 80% to AI and something blows up in production, how do you roll back?” The reason an organization can delegate up to 80% is, presumably, because CI/CD and rollback automation are solid. Without a path to revert to the previous version with one command the moment you deploy, you can’t make the management decision to push the AI-generation rate up.

Concretely, preparation looks like this.

# For Vercel
vercel rollback <deployment-id>
# Find the <deployment-id> with `vercel ls`, which lists recent deployments

# For Render
# Click "Rollback" in the dashboard to revert to the previous version

# For AWS ECS
aws ecs update-service \
  --cluster <cluster> \
  --service <service> \
  --task-definition <previous-task-def-arn>
# One command reverts to the previous task definition

Teams that haven’t automated rollback shouldn’t be raising their AI-generation rate. Stumbling-point note: I once deployed to production without a rollback path and cried at 2 a.m. doing DB recovery. Preparing in advance is cheaper.

Axis 3: Is There Model-Dependency Duplication?

A state of “if Composer 2 stops working, tomorrow’s business stops” is not production operations. This is also a story continuous with the $60B article. Now that Cursor’s ownership might shift to SpaceX, the risk of concentrating models or tools on a single vendor has become real.

Suppose PeopleX is running 80% on Cursor + v0. If a major outage hits Cursor, is there a fallback? You can’t verify this from the outside, but it’s a required condition to call something production operations.

At my desk, Cursor + Composer 2 is the primary. Claude Code (Opus/Sonnet) is the secondary, GitHub Copilot (GPT-5) is the backup — a 3-layer setup. If layer 1 goes down, layer 2 keeps things running. Including layer 3, you avoid a complete business halt from one vendor’s incident.

Individual developers can build this today. Three subscriptions for $50–$70 per month. As insurance in a world where $60B moves around, that’s on the cheap side.
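The three-layer idea can be sketched as a simple "first tool that is up wins" picker. Everything here is hypothetical: `tool_is_up` is a stand-in health check driven by a `DOWN_TOOLS` variable I invented for the example, and the tool names are just labels. In practice you would replace the check with a real probe, such as the vendor's status page.

```shell
# Hypothetical health check: a tool counts as down if it is named in DOWN_TOOLS.
# Replace this with a real probe (vendor status page, CLI smoke test) in practice.
tool_is_up() {
  case " $DOWN_TOOLS " in
    *" $1 "*) return 1 ;;
    *) return 0 ;;
  esac
}

# pick_tool <primary> <secondary> <backup>...: print the first tool that is up
pick_tool() {
  for tool in "$@"; do
    if tool_is_up "$tool"; then
      echo "$tool"
      return 0
    fi
  done
  echo "none"
  return 1
}

# Example: the primary is down, so the secondary takes over
DOWN_TOOLS="cursor"
echo "Use today: $(pick_tool cursor claude-code copilot)"
```

The order of the arguments is the priority order, so changing your primary is a one-word edit.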

Axis 4: Are Operational Logs in a “Readable Later” State?

The last axis is unglamorous but effective. Are AI-generated code diffs in a state where you can read them back together later?

Whatever metadata Cursor holds internally to identify AI-generated lines, it doesn’t get retained in the destination repository. With no countermeasures, you won’t be able to tell six months later whether the buggy line was written by AI or by a human.

Recently, I started a practice of adding [ai-assisted] or [ai-generated] tags to commit messages. A Git hook auto-attaches them when Cursor is in use. Just this enables both “incident root-cause separation” and “in-house AI generation rate measurement.”

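#!/bin/bash
# Example: .git/hooks/prepare-commit-msg (make it executable with chmod +x)
# Auto-prepend an [ai-assisted] tag if the commit comes from a Cursor session.
# Assumption: CURSOR_SESSION_ID is a placeholder variable. Confirm what your
# environment actually sets, or export it yourself in Cursor's terminal.
msg_file="$1"
# Skip if the message already carries an [ai-...] tag (e.g. on amend)
if [ -n "$CURSOR_SESSION_ID" ] && ! grep -q '^\[ai-' "$msg_file"; then
  # Insert the tag at the start of the first line, then drop sed's backup file
  sed -i.bak '1s/^/[ai-assisted] /' "$msg_file" && rm -f "$msg_file.bak"
fi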

The reason PeopleX could publish an 80% number is, presumably, because they have similar measurement infrastructure. Conversely, a state of “can’t produce the ratio of AI-written to non-AI-written lines” is the experiment stage, not production.
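Putting the four axes together, the "three or more means production" rule can be sketched as a tiny self-check. `classify_stage` is a name I made up; each argument is 1 if that axis is in place for your project and 0 if not.

```shell
# classify_stage <review> <rollback> <redundancy> <logging>
# Each argument is 1 (in place) or 0 (missing). Three or more means "production".
classify_stage() {
  score=$(( $1 + $2 + $3 + $4 ))
  if [ "$score" -ge 3 ]; then
    echo "production ($score/4)"
  else
    echo "experiment ($score/4)"
  fi
}

# Example: review, rollback, and logging are in place, but no fallback tool yet
classify_stage 1 1 0 1
# Example: only review and logging are in place
classify_stage 1 0 0 1
```

The scoring is deliberately blunt. It is a prompt for an honest answer on each axis, not a certification.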

[Action] 3 “Production Readiness Checks” for the $20/month Individual Developer to Run This Week

If after reading this far you thought “production operations is too early for me,” I want to push back as a burnout-club member. Production readiness isn’t an organizational story. It’s a story about “how much of your own work you can delegate to AI and keep running.” The moment you’re using it at $20/month, you’re standing at the entrance.

To keep the PeopleX 80% story from ending as “someone else’s story,” let me share the 3 actions I started this week. I burned 3 hours, so you should clear it in 3 minutes.

Action 1: Visualize Your Own Repo’s “AI Generation Rate”

First, put a number on how much you’re currently delegating to AI. “About half” is weak. Measure the actual count weekly.

Here’s my method. Just add tags to commit messages.

# Share of the past week's commits that carry an [ai-...] tag
# (this counts commits, not lines, and writes no temp files into the repo)
total=$(git log --since="1 week ago" --oneline | wc -l | tr -d ' ')
ai=$(git log --since="1 week ago" --oneline | grep -c "\[ai-")
echo "AI generation rate: $ai / $total"

When you measure for the first time, the number often comes in lower than you thought. I initially figured I was “delegating about 70%,” but the actual number was 42%. Visualizing it shows you the upside.

PeopleX’s 80% is a number they could publish because they measure it in-house. You can have your own number, too.
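Counting commits is a coarse proxy. If you want something closer to a line-based rate, here is a minimal sketch that weights each commit by its added lines. It assumes the [ai-...] tag convention described earlier; `ai_line_rate` is my own illustrative name, and the sample input stands in for real git output.

```shell
# ai_line_rate: read `git log --numstat --format='@%s'` output on stdin and print
# the percentage of added lines that belong to commits tagged [ai-...].
ai_line_rate() {
  awk '
    /^@/ { ai = ($0 ~ /^@\[ai-/); next }   # subject line: remember if it is tagged
    $1 ~ /^[0-9]+$/ {                      # numstat row: added, deleted, path
      total += $1
      if (ai) ai_lines += $1
    }
    END {
      if (total == 0) print 0
      else printf "%d\n", ai_lines * 100 / total
    }
  '
}

# Sample input standing in for real git output (two commits, one tagged)
printf '@[ai-assisted] add signup form\n\n30\t2\tsrc/form.ts\n@fix typo\n\n10\t1\tREADME.md\n' | ai_line_rate

# Real usage inside a repository:
#   git log --since="1 week ago" --numstat --format='@%s' | ai_line_rate
```

Binary files show up as "-" in numstat output and are skipped. A tagged commit is counted as fully AI-written, so this still overstates mixed commits.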

Action 2: Compress Rollback to Within One Command

Before going production, always confirm the rollback path. Vercel, Render, Cloudflare Pages, Netlify, AWS Amplify — whichever service you use, you should be able to revert to the previous deployment from the dashboard or with one command.

A non-trivial number of people don’t know this. Right now, search “rollback” in the docs of the service you use, and try reverting once for practice. It takes 5 minutes during off-hours when production isn’t moving.

“I’ll figure it out when it happens” is something you must not do in production. “I tried it once during the experiment phase” is a precondition for production operations. As a burnout-club member, I neglected this and caused 2 incidents in 3 months.
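One way to make "one command" literal is to write the rollback down as a script before you need it. The sketch below only prints the command for a given platform so you can rehearse safely. `rollback_cmd`, the `CLUSTER` and `SERVICE` variables, and the example IDs are all hypothetical; the two CLI invocations are the same ones shown in the Axis 2 section.

```shell
# rollback_cmd <platform> <target>: print the rollback command for a platform.
# It only prints. Run the printed command yourself once you have rehearsed it.
rollback_cmd() {
  case "$1" in
    vercel)
      echo "vercel rollback $2" ;;
    ecs)
      # CLUSTER and SERVICE are hypothetical environment variables for this sketch
      echo "aws ecs update-service --cluster ${CLUSTER:-my-cluster} --service ${SERVICE:-my-service} --task-definition $2" ;;
    *)
      echo "no CLI rollback recorded for '$1': use the dashboard" >&2
      return 1 ;;
  esac
}

# Example with a made-up deployment ID
rollback_cmd vercel dpl_example123
```

Keeping this in the repository means the rollback path gets reviewed like any other code.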

Action 3: Build a “Duplicated Subscription” for Model Selection at $50/month

Finally, restructure your monthly spending a bit. Don’t concentrate AI editors on one company.

I’ll expose my current setup.

| Type | Subscription | Monthly | Role |
| --- | --- | --- | --- |
| Primary | Cursor + Composer 2 | $20 | Daily implementation, first response for all tasks |
| Secondary | Claude Code Pro | $20 | Design reviews, long-form tasks |
| Backup | GitHub Copilot | $10 | Personal OSS, lightweight completion |
| Total | | $50 | |

$50/month. In a world where $60B moves around, this is cheap as insurance. If it lets you avoid a complete business halt from one company’s outage, spending the cost of about 10 coffees a month is a rational decision.

You don’t need to read the PeopleX 80% story and think “I’ll be at 80% tomorrow too.” But I want you to think “I’ll commit to entering the production operations phase.” The $50/month duplicated subscription is a mechanism to give shape to that commitment in 3 minutes.

[Figure: action map for the 3 production readiness checks, with three circles labeled "Visualize AI generation rate," "Rollback in one command," "Duplicated subscription for model selection," each with "Target: this week"...]

Summary — 80% Is the Signal of “From Experiment to Production”

Got long, so let me leave the summary as bullets.

  • PeopleX officially announced in June 2025 that “approximately 80% of new implementation source code across all products is AI-generated” (PR TIMES, Nikkei, CreatorZine reports). The full-scale adoption of Cursor and v0 is the backdrop
  • About 9 months later, Cursor Composer 2 went GA. SWE-bench Multilingual 73.7, output rate $2.50/M, MoE-accelerated. A strong production-ready candidate among coding-specialized models
  • The $60B acquisition option I wrote about yesterday is the “macro of capital.” The 80% is the “micro of operations.” Different angles on the same event
  • 4 axes separating production from experiment: full-line review, rollback, model duplication, operational logs
  • Individual developer’s actions this week: visualize AI generation rate, compress rollback to one command, build $50/month duplicated subscription

A final word for fellow burnout-club members. Don’t separate “PeopleX, the company that seriously entered production” from “me, the individual still in the experimental phase.” The moment you’re using it at $20/month, you’re on the same ship. The only difference is the call: “What percentage do you delegate?”

Because I have burnout experience, I understand the weight of the words “production operations.” The moment production runs on lines AI wrote, there’s responsibility for what’s running. That’s exactly why I want to clear the 4-axis check this week. With the burnout-club’s map-reading skill, I’ll keep translating the PeopleX 80% story into the story of my own field.

Now, I’m going to open Cursor today and again tell it “let’s just make something that runs.” With Composer 2 working at my back and PeopleX running 80% on the ship next door, on top of that map, my $20/month work doesn’t change. But the meaning of that $20 has, certainly, been rewritten.



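Written by ゲン, CS × Vibe Coder

Honestly, I gave up on being an engineer once. At the development company I joined straight out of school, I was surrounded by monstrously talented people and realized, “Ah, I’m not on this side.” After that I moved into customer success and stayed for 10 years. But when I met Cursor and Claude Code, everything changed. The code doesn’t have to be perfect. If I can write code that makes my own work easier, that’s enough. On weekends I unwind in the sauna while thinking about the next tool to build.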