
TL;DR

A recent Google whitepaper argues that in AI-driven software development the model itself accounts for only about 10% of system behavior. The remaining 90% or so comes from harness and context engineering, and that is where the paper says teams should focus. This changes how organizations should approach AI integration.

The whitepaper, "The New SDLC With Vibe Coding," was authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis and published by Google. It treats harness and context engineering, rather than the model, as the primary drivers of how an AI system performs.

The whitepaper challenges the common perception that the AI model is the most critical component. It asserts instead that the surrounding infrastructure (prompts, rules, tools, and observability) accounts for approximately 90% of the system’s behavior. The headline evidence is a benchmark result: an agent moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, with the same core model.

According to the authors, this redefines the strategic focus for AI teams: investment should go first to building robust harnesses and developing context engineering skills. Compared with chasing the latest model release, this approach can lower total cost of ownership and improve system reliability.

At a glance
Type: report
When: published May 2026
The development: Google’s new whitepaper highlights that the core of effective AI systems lies in harness and context engineering, not just the AI models themselves.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
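
To make the tests-versus-evals distinction concrete, here is a minimal sketch in Python. It is an illustration only, not code from the whitepaper: the names `slugify`, `EvalCase`, `score_output`, and `ci_gate`, the keyword-based scoring, and the 0.8 threshold are all assumptions chosen for the example. A test asserts an exact result for deterministic code, while an eval scores free-form model output and is gated on a threshold.

```python
from dataclasses import dataclass
from typing import List


def slugify(title: str) -> str:
    """Deterministic helper: its exact output can be asserted by a test."""
    return "-".join(title.lower().split())


def test_slugify() -> bool:
    # A test checks an exact, repeatable result.
    return slugify("Hello World") == "hello-world"


@dataclass
class EvalCase:
    """Hypothetical eval case: a prompt plus terms a good answer should mention."""
    prompt: str
    must_mention: List[str]


def score_output(output: str, case: EvalCase) -> float:
    """Score free-form output as the fraction of required terms it contains."""
    if not case.must_mention:
        return 1.0
    text = output.lower()
    hits = sum(1 for term in case.must_mention if term.lower() in text)
    return hits / len(case.must_mention)


def ci_gate(test_results: List[bool], eval_scores: List[float],
            threshold: float = 0.8) -> bool:
    """Pass only if every test passes and the mean eval score meets the threshold."""
    if not all(test_results):
        return False
    mean = sum(eval_scores) / len(eval_scores) if eval_scores else 0.0
    return mean >= threshold
```

In a real pipeline the eval scorer would likely be a rubric or a model-graded check rather than keyword matching. The point of the sketch is only that both signals feed one gate.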
The idea worth building your strategy around
Agent = Model + Harness
MODEL: ~10%
HARNESS: ~90% · prompts · tools · context · hooks · sandboxes · observability
The ~90% is your surface area, not the provider’s.
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.
“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
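
The "Agent = Model + Harness" framing can be sketched in a few lines of Python. Everything here is hypothetical and invented for illustration: `Harness`, `stub_model`, `calc`, and the `CALL name arg` convention are not an API from the paper or from any vendor. The sketch shows the same stand-in model failing or succeeding depending only on whether the harness has the tool configured.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Model = Callable[[str], str]  # a model is treated as "prompt in, text out"


@dataclass
class Harness:
    """Hypothetical harness: rules, tools, and a log around a fixed model."""
    system_rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def build_prompt(self, task: str) -> str:
        rules = "\n".join(f"RULE: {r}" for r in self.system_rules)
        tool_names = ", ".join(sorted(self.tools)) or "none"
        return f"{rules}\nTOOLS: {tool_names}\nTASK: {task}"

    def run(self, model: Model, task: str) -> str:
        prompt = self.build_prompt(task)
        self.log.append(prompt)  # observability: keep what the model saw
        reply = model(prompt)
        # Assumed convention: the model requests a tool as "CALL name arg".
        if reply.startswith("CALL "):
            _, name, arg = reply.split(" ", 2)
            tool = self.tools.get(name)
            if tool is None:
                return f"error: tool '{name}' not configured"
            return tool(arg)
        return reply


def stub_model(prompt: str) -> str:
    """Toy stand-in for a model: it always asks for the calculator tool."""
    return "CALL calc 6*7"


def calc(expr: str) -> str:
    """Tiny multiply-only tool, enough for the illustration."""
    left, right = expr.split("*")
    return str(int(left) * int(right))


bare = Harness()
equipped = Harness(system_rules=["Use tools for arithmetic."], tools={"calc": calc})
print(bare.run(stub_model, "multiply six by seven"))
print(equipped.run(stub_model, "multiply six by seven"))
```

The model function is identical in both runs. Only the configuration around it differs, which is the kind of "missing tool" failure the quote describes.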
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding
Low CapEx · High OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering
High CapEx · Low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
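
The "intelligent model routing" lever can be pictured as a small routing function. The sketch below is an assumption-laden illustration rather than the paper's method: the tier names, per-token prices, keyword hints, and the 8,000-token cutoff are all made up for the example. The idea is simply to send easy work to a cheap model and reserve the expensive one for hard or long-context tasks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, not a real vendor rate


CHEAP = ModelTier("small-model", 0.10)
STRONG = ModelTier("large-model", 2.00)

# Assumed heuristic: these words suggest a task needs the stronger model.
HARD_HINTS = ("refactor", "architecture", "migrate", "security")


def route(task: str, context_tokens: int, long_context: int = 8000) -> ModelTier:
    """Pick the cheap tier unless the task looks hard or the context is long."""
    text = task.lower()
    if context_tokens > long_context or any(hint in text for hint in HARD_HINTS):
        return STRONG
    return CHEAP


def estimated_cost(task: str, context_tokens: int) -> float:
    """Rough cost of one call under the routing decision."""
    tier = route(task, context_tokens)
    return tier.cost_per_1k_tokens * context_tokens / 1000
```

A production router would more plausibly use a classifier or past eval results than keyword matching, but the cost logic has the same shape.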
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR); verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Implications for AI Development Strategies

This shift means organizations should reallocate resources from chasing newer, larger models to refining their harnesses and context management. It emphasizes that the durability and effectiveness of AI systems depend more on configuration, tooling, and engineering practices than on the raw model size or capability. This insight can lead to cost savings, improved security, and more reliable AI deployment in production environments.

Evolution of AI System Design and Industry Perspectives

Earlier in 2026, AI development was often focused on acquiring larger models, with the belief that bigger models automatically yield better results. However, recent experiments and industry reports, including this whitepaper, highlight that the performance gap is often due to how the models are integrated and managed. The paper builds on prior discussions about the importance of verification, testing, and structured workflows in AI engineering.

“The AI model accounts for only about 10% of system behavior; the rest is in harness and context engineering.”

— Addy Osmani

Unresolved Questions About Implementation and Cost

While the whitepaper provides compelling evidence that harness and context are critical, it remains unclear how organizations will standardize these practices at scale. The specific methodologies for developing effective harnesses and the associated costs are still evolving, and industry adoption may vary widely.

Next Steps for AI Teams and Industry Adoption

Organizations are expected to begin reevaluating their AI development strategies, investing more in infrastructure, tooling, and skill development in harness and context engineering. Future research and case studies will likely explore best practices for scalable implementation and cost management in this new paradigm.

Key Questions

Why is the model only 10% of the system’s behavior?

The whitepaper argues that most of the system’s performance depends on how the AI is integrated, configured, and managed through harnesses, prompts, tools, and observability—these factors account for about 90% of effectiveness.

How can organizations improve their harness and context engineering?

By investing in structured workflows, developing reusable context schemas, and building robust tooling for configuration, testing, and monitoring, organizations can significantly enhance AI performance.
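
As one possible reading of a "reusable context schema", the Python sketch below gives each piece of context a source label and a priority, then assembles a prompt context within a fixed budget. The names `ContextItem` and `build_context`, the character-based budget, and the priority scheme are assumptions for illustration. The whitepaper does not prescribe this structure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextItem:
    """Hypothetical schema for one piece of context fed to an agent."""
    source: str    # where it came from, e.g. "rules" or "spec"
    text: str
    priority: int  # lower number = more important


def build_context(items: List[ContextItem], budget_chars: int) -> str:
    """Add items in priority order, skipping any that would exceed the budget."""
    chosen: List[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        if used + len(item.text) > budget_chars:
            continue  # drop low-value or oversized items instead of truncating
        chosen.append(item)
        used += len(item.text)
    return "\n\n".join(f"[{i.source}]\n{i.text}" for i in chosen)
```

Keeping the schema explicit makes it easier to test what the model actually sees and to keep noisy material, such as long chat logs, out of the prompt.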

Does this mean model size is irrelevant?

No, larger models can still be beneficial, but their impact is limited unless complemented by strong harnesses and context management. The whitepaper emphasizes that model improvements alone are insufficient for optimal performance.

What are the risks of focusing too heavily on models?

Overemphasizing models might lead organizations to overlook the importance of system design, security, and cost management. A balanced approach that prioritizes harness and context is recommended.

Will this approach reduce AI development costs?

Potentially, yes. By focusing on configuration and system management, organizations can lower ongoing operational costs and improve system reliability, leading to better cost efficiency over time.
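
The cost argument is a break-even calculation: a higher upfront investment pays off once enough features have shipped at a lower per-feature cost. The sketch below works one example in Python. The cost units are invented for illustration, and only the 5× per-feature ratio is chosen to sit inside the 3–10× range the paper cites. `Approach`, `cumulative_cost`, and `crossover_features` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Approach:
    name: str
    capex: int             # upfront cost in arbitrary units (specs, evals, context)
    opex_per_feature: int  # ongoing cost per shipped feature, same units


def cumulative_cost(approach: Approach, features: int) -> int:
    """Total cost after shipping a given number of features."""
    return approach.capex + approach.opex_per_feature * features


def crossover_features(low_capex: Approach, high_capex: Approach) -> Optional[int]:
    """Smallest feature count at which the high-CapEx approach is no more expensive."""
    saving = low_capex.opex_per_feature - high_capex.opex_per_feature
    if saving <= 0:
        return None  # the upfront investment never pays back
    extra = high_capex.capex - low_capex.capex
    return max(0, -(-extra // saving))  # integer ceiling division


# Hypothetical figures: vibe coding costs 5x more per feature than agentic.
VIBE = Approach("vibe coding", capex=0, opex_per_feature=5)
AGENTIC = Approach("agentic engineering", capex=40, opex_per_feature=1)

print(crossover_features(VIBE, AGENTIC))
```

With these assumed numbers the two approaches cost the same at ten features, and agentic engineering is cheaper for every feature after that. Real figures will differ by team and codebase.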

Source: ThorstenMeyerAI.com
