40% of Enterprise Apps Will Have Built-in AI Agents. A Decision Sheet to Determine 'Should We Get On Board?' This Quarter
Gartner predicts 40% of enterprise apps will feature AI agents by end of 2026. Yet AWS/IDC research shows 97% of organizations have failed to scale them. We decode this contradiction and offer a 5-item decision sheet to turn it into actionable judgment material for your company.
AI Agent Implementation Series — Business Edition #1 (The technical edition runs on Gen’s notes)
“Should we be bringing AI agents in too?”
I’ve lost count of how many times I’ve been asked this question in the last three months. Executives, division heads, marketing leads. Different positions, but the same question. And my answer is always the same: “Without decision material in hand, agonizing won’t produce an answer.”
Start with Gartner’s forecast. By the end of 2026, 40% of enterprise apps will feature task-specific AI agents. In 2025, the figure was less than 5%. That is at least an 8x increase in roughly one year.
Meanwhile, AWS (Amazon Web Services) and IDC (International Data Corporation) report in their joint study that “only 3% of organizations have successfully scaled agentic AI.”
40% versus 3%. The decision material you need is hidden in the gap between these two numbers.
In this article, I’ll lay out the primary data from four sources: Gartner, Salesforce, AWS, and Microsoft. I’ve also prepared a 5-item sheet to help you decide “should we get on board?” within this quarter.

What the 40% Really Means: The “Landscape at the End of 2026” Gartner Predicted
Gartner’s “40%” is a forecast about task-specific agents embedded in apps — not about general-purpose AI. Confuse these two and your judgment goes sideways.
Reading Gartner’s August 2025 forecast accurately, the key qualifier is “task-specific.”
Automatic email classification, invoice reconciliation, first-line customer support response. AI agents are being embedded into “tasks that repeat, have clear rules, and have a narrow scope of judgment.” This isn’t a story about general-purpose AI like ChatGPT replacing everything.
What was less than 5% in 2025 is projected to hit 40% by the end of 2026. Behind this sharp rise lies movement on the platform side.
Microsoft announced that over 230,000 organizations are building custom agents via Copilot Studio. Salesforce’s Agentforce is rapidly expanding its deployed customer base. As the AWS/IDC joint study shows, production deployments of cloud-based agents — including Bedrock Agents — are also surging.
In other words, the 40% figure isn’t built up from zero. It’s spreading through existing SaaS (Software as a Service) and cloud platforms in the form of “agent features being bolted on later.”
What’s worth pausing on here is the tools your own company uses. Salesforce, Microsoft 365, AWS. The agent features may already be implemented — you just haven’t turned them on.
Take a company on a Microsoft 365 contract. The environment for building your own custom agents via Copilot Studio is already in place. You can build an agent that references documents stored in your internal SharePoint and automatically answers employee inquiries. Before thinking “we need to bring in a new tool,” the first step is checking the settings panel of the tools you already have.
This “bolt-on” trend structurally resembles the earlier cloud migration. Companies that initially said “the cloud feels risky” eventually found that their Office and email were on the cloud before they knew it. AI agents are likely to follow the same path.

What Are the Companies in Motion Doing? What Three Companies’ Implementation Data Tells Us
At Salesforce, agent-handled customer service conversations grew 22x, and agent creation grew 119%. AWS reports “50% are running 10+ agents in production.” Leading companies are already in the practical-use stage.
Let’s look at concrete numbers.
We’ll start with Salesforce’s H1 2025 data. At companies that have deployed Agentforce, customer service conversations handled by agents grew 22x. In the six months after deployment, agent creation grew by 119%.
Imagine first-line response to inquiries. If conversations that can be handled automatically scale from 1,000 to 22,000 per month, how does that free up your team’s resources? The difference — being able to redirect those resources to “judgment work only humans can do” — is what separates deployed companies from non-deployed ones.
Salesforce’s “Connectivity Report 2026” goes further. 83% of IT departments responded that they have “adopted autonomous agents across most business units.” 67% of organizations plan to expand multi-agent (mechanisms where multiple AI agents coordinate) deployments within two years.
In the AWS/IDC joint study, 50% of respondent companies are running 10 or more AI agents in production. 67% have 10 or more agents in development.
Microsoft’s numbers are striking too. According to GitHub Octoverse 2024, GitHub Copilot’s paid users reached 4.7 million — a 75% year-over-year increase. The same report notes that roughly 90% of Fortune 100 companies have adopted GitHub Copilot.
What these data points share is a single fact: it’s not “trying,” it’s “using in production.” Companies that have moved past the PoC (proof of concept) stage are starting to deliver visible results.
What deserves attention is Salesforce’s data point that “the enterprise average is 12 AI agents.” Not one; twelve. Companies are running agents specialized by department: sales, customer support, accounting, HR. In other words, the average company has already moved past “should we deploy a single AI agent” to “how many agents do we place in each department.” If your company is still debating the first agent, it is worth knowing how wide that gap is.

97% Haven’t Scaled: The Real Nature of the “Deployment Wall”
Look only at the success data from leading companies and your judgment goes wrong. AWS/IDC research shows 97% of organizations have failed to scale AI agents. Gartner also warns that “more than 40% of projects will be canceled.”
Where there’s light, there’s shadow. From here, we look at data that demands caution.
The two most striking numbers from the AWS/IDC study are these: “97% of organizations have failed to achieve enterprise-wide scaling” and “less than 7% have reached full production deployment in at least one use case.”
50% are running 10 or more agents in production, yet 97% have failed to scale. This seemingly contradictory data reflects the reality that “things work inside a department, but enterprise-wide deployment hasn’t happened.”

Gartner also issued a warning in June 2025: “By the end of 2027, more than 40% of agentic AI projects will be canceled.”
The main reasons cited for cancellation are three: insufficient governance, data quality issues, and operating costs that exceed expectations. These aren’t technology problems — they’re organizational and operational problems.
From my own experience running AI agents 24/7 at Izumo System, there’s something I can say with confidence: “keeping them running” is 10x harder than “getting them running.”
Concretely, the work is triage when an agent isn’t behaving as expected: check the logs, fix the prompt, re-test. This happens every week. At Izumo, five or more agents collaborate to produce articles, and when one agent’s output drifts, it ripples through the whole system. In my experience, operating cost runs several times deployment cost. This isn’t just my story; it matches IDC’s research findings.
The forecast that “40% will adopt” and the warning that “40% will be canceled” coexisting isn’t a contradiction. It’s reality. Deployment is easy; sustaining it is hard. This recognition is the prerequisite to hold before using the decision sheet.

“Should We Get On Board?” A 5-Item Decision Sheet
The question isn’t “should we deploy enterprise-wide” — it’s “which work should we start with.” Check five items in 30 minutes and your priorities for this quarter are locked in.
Let me clarify the premise of this decision sheet. This is not a sheet for asking “should we deploy AI agents.” It’s a sheet for identifying “which work has the lowest failure risk to start with.”
Item 1: Are there repetitive tasks?
Are there tasks where you spend 5+ hours a week running the same procedure? Email triage, data entry, recurring report creation, first-line response to inquiries. If three or more such tasks apply, you have plenty of candidates for agent deployment.
Item 2: Status of agent support in existing platforms
Check the SaaS your company uses. Salesforce, Microsoft 365, HubSpot, Zendesk. The question is whether these already ship with AI agent features. If they do, you may be able to start by simply “flipping the setting on” — no new tool deployment required.
Item 3: State of data readiness
Is the data your agent will reference structured and centrally managed? Customer data scattered across Excel and a CRM (customer relationship management system). Manuals left as PDFs and not updated. In states like this, agents can’t function correctly.
Item 4: Governance structure
When an AI agent makes a wrong decision, have you decided who detects it and who corrects it? “Hand it over to the AI and walk away” is the highest-risk way to operate. At a minimum, you need to designate someone responsible for checking the agent’s output weekly.
Item 5: Can you estimate ROI?
Can you roughly estimate “how many person-hours per month will be saved”? Salesforce’s 22x growth in customer service conversations is just a benchmark. There’s no guarantee the same effect will show up in your work. I recommend first running a two-week pilot for one task and capturing actual measurements.
If you can answer “yes” to three or more of the five items, it’s a rational decision to start a pilot in one task within this quarter. If two or fewer, no need to rush. Prioritize building data readiness and governance structure first.
Let me add an example to make the judgment concrete. For an e-commerce operator, Item 1 often qualifies via “inquiry response” at 10+ hours per week. For Item 2, Shopify and Zendesk already ship AI features. For Item 3, if the product master is managed as CSV, you can judge it as structured. Even if Items 4 and 5 aren’t yet in place, clearing three items makes a pilot worth considering.
Conversely, if internal data lives in Excel files that only individual employees know how to maintain, you have a problem. Even if you deploy an AI agent, it can’t reach the information it should reference. This isn’t a “tool deployment problem”; it’s a “data readiness problem.” Solve this first, or, as Gartner warns, you’ll likely end up in the cancellation column.
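To make the scoring rule and the e-commerce example above concrete, here is a minimal Python sketch. The item keys, the function name evaluate_sheet, and the recommendation strings are my own labels for illustration, not part of any tool; only the five items and the three-or-more threshold come from the sheet itself.

```python
# Keys are hypothetical labels for the five items on the decision sheet.
ITEMS = (
    "repetitive_tasks",   # Item 1: tasks repeating 5+ hours per week
    "platform_support",   # Item 2: existing SaaS already ships agent features
    "data_readiness",     # Item 3: data is structured and centrally managed
    "governance",         # Item 4: owners for detection and correction are named
    "roi_estimable",      # Item 5: person-hours saved can be roughly estimated
)

PILOT_THRESHOLD = 3  # "three or more yeses" from the sheet


def evaluate_sheet(answers: dict[str, bool]) -> tuple[int, str]:
    """Count the yes answers and return (score, recommendation)."""
    missing = [item for item in ITEMS if item not in answers]
    if missing:
        raise ValueError(f"unanswered items: {missing}")
    score = sum(1 for item in ITEMS if answers[item])
    if score >= PILOT_THRESHOLD:
        return score, "start a one-task pilot this quarter"
    return score, "build data readiness and governance first"


# The e-commerce operator from the example: Items 1-3 yes, Items 4-5 not yet.
ecommerce = {
    "repetitive_tasks": True,
    "platform_support": True,
    "data_readiness": True,
    "governance": False,
    "roi_estimable": False,
}

score, recommendation = evaluate_sheet(ecommerce)
print(score, recommendation)
```

Flipping any one of the three yes answers to False drops the score to two, and the sketch returns the “build data readiness and governance first” recommendation instead.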

Which Type Are You? The First 90-Day Roadmap After Deciding
It’s not a binary “on board or not” — readers split into four types. Take the first step that matches your type, and 90 days later, the evidence for your judgment is in hand.
The four types, and the first step for each, are as follows.
Type A: Already deployed. Scaling is the challenge.
You’re already using Salesforce Agentforce or Microsoft Copilot. Your next challenge is “how do we expand intra-department success across the entire enterprise?” The AWS/IDC study’s 97% failure-to-scale isn’t someone else’s problem. First, summarize the quantitative effects from your deployed department (hours saved, auto-resolution rate, error rate) in three metrics. They become the briefing material for executives.
Type B: Considering but not started. Want to decide this quarter.
This type is probably the largest group. Run through the 5 items on the decision sheet this week, and if you have three or more “yeses,” draft a pilot plan for one task by next month. Two weeks is enough for a pilot. Start with low-risk work like “auto-generation of weekly reports.”
Type C: Thinking “this doesn’t apply to us”
Gartner’s 40% figure isn’t limited to specific industries. Supply chain management in manufacturing, inventory optimization in retail, property matching in real estate. “We’re not IT, so it doesn’t apply to us” is rapidly becoming untenable in 2026. Start by checking the AI agent features of the SaaS your company already uses.
Type D: Want technical decision material
If you need technical decisions on “which platform to implement on” or “how to ensure security,” read alongside the “AI Agent Implementation Series — Technical Edition” running on Gen’s notes. Running business judgment and technical judgment in parallel means that 90 days from now, your implementation plan will be in hand.
What you’re aiming for in the first 90 days isn’t “starting enterprise-wide deployment.” It’s “getting measured data on whether AI agents are effective for your company.”
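What “measured data” could look like after a two-week pilot is sketched below in Python. It computes the three metrics mentioned for Type A (hours saved, auto-resolution rate, error rate) from a simple task log. The PilotRecord fields, the summarize_pilot function, and the sample numbers are all hypothetical. In particular, counting saved time as the manual minutes on agent-resolved tasks minus all human review minutes is one assumption among several reasonable ones.

```python
from dataclasses import dataclass


@dataclass
class PilotRecord:
    """One task handled during the pilot (all fields are hypothetical)."""
    resolved_by_agent: bool   # finished without a human taking over
    had_error: bool           # agent output needed correction afterwards
    manual_minutes: float     # time the task took before the pilot
    review_minutes: float     # human time spent checking the agent's output


def summarize_pilot(records: list[PilotRecord]) -> dict[str, float]:
    """Return hours saved, auto-resolution rate, and error rate."""
    if not records:
        raise ValueError("no pilot records to summarize")
    total = len(records)
    auto_resolved = sum(1 for r in records if r.resolved_by_agent)
    errors = sum(1 for r in records if r.had_error)
    # Assumption: time is saved only on tasks the agent finished,
    # and every minute of human review counts against the saving.
    saved_minutes = (
        sum(r.manual_minutes for r in records if r.resolved_by_agent)
        - sum(r.review_minutes for r in records)
    )
    return {
        "hours_saved": saved_minutes / 60,
        "auto_resolution_rate": auto_resolved / total,
        "error_rate": errors / total,
    }


# Made-up sample log, only to show the shape of the input.
records = [
    PilotRecord(True, False, 12, 2),
    PilotRecord(True, True, 12, 4),
    PilotRecord(False, False, 12, 0),
    PilotRecord(True, False, 12, 2),
]

for name, value in summarize_pilot(records).items():
    print(f"{name}: {value:.2f}")
```

Whatever definitions you choose, fix them before the pilot starts, so the numbers you bring to executives are comparable across departments.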

Summary: Whether You Ride the 40% Wave Is Decided by This Quarter’s Judgment
Gartner’s 40% forecast and IDC’s 97% failed-to-scale. What these two numbers signal is the reality that “deployment is accelerating, but only organizations that have prepared will succeed.”
Let’s revisit the 5-item decision sheet covered today.
- Item 1: Do you have 5+ hours per week of repetitive tasks?
- Item 2: Do your existing SaaS tools already ship agent features?
- Item 3: Is your data structured and centrally managed?
- Item 4: Have you designated owners for governance (monitoring and correction)?
- Item 5: Can you measure effects in a two-week pilot?
If you have three or more “yeses,” starting a pilot within this quarter is a rational decision. With two or fewer, no need to rush. Get data readiness and governance structure in place first.
The 40% wave is coming. But the timing for riding it is set by your company’s state of preparation. Let today be the last day you spend agonizing over “should we get on board?” without decision material in hand.
Precisely because deployment is easy and sustaining it is hard, narrowing down “the work to start with” via a decision sheet is the first step.
In this series, we’ll draw the full picture of implementation using both wheels — business judgment and technical judgment. If you try the decision sheet, please share the results. In the next Business Edition #2, we’ll dig into “measuring agent deployment ROI and explaining it to executives.”

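If you can’t make good use of AI, the gap will keep widening from here on. I work with AI agents every day: I run them, break them, fix them, and run them again. This is where I write up that kind of hands-on, in-the-trenches practice. I’ve left the theory to others. I build things that work. My routine is to get up at 5 a.m., go for a walk, and then write code.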


