v1.0 — March 2026

Preparing for the Age of AI
A Living Outlook for Decision-Makers

A structured framework for navigating AI advancement over the next 36 months — not a prediction, but preparation across plausible futures.

Feedback: ai-scenarios@appliedai-institute.de

Disclaimer

Signal monitoring, news identification, and current reads on this page are generated with AI-based methods. Some relevant developments may be missing; others may be incorrectly interpreted. All summaries and perspective assessments are reviewed and supervised by human experts at the appliedAI Institute to ensure quality and accuracy.

Last news update

27 March 2026

Last expert review

27 March 2026

Summary

1 Purpose

Why this framework exists and who it serves.

This document provides a structured analytical framework for European decision-makers in policy, industry, and civil society. It offers three plausible AI advancement perspectives, assesses their probability through observable technical drivers, maps impacts across 10 key domains, and derives concrete response measures. The goal is not to predict one future, but to prepare across plausible futures.

2 Perspective Assessment

Three plausible perspectives for AI capability advancement over the next 36 months, differentiated solely by the speed of capability progress.

S1: Plateau 5%
AI capabilities improve only marginally. Impact driven by diffusion of existing systems.
S2: Continued Pace 50%
Steady capability gains. Agentic workflows mature. Uneven adoption widens competitive gaps.
S3: Accelerated 45%
Compounding gains with potential discontinuities. Very powerful systems possible within timeframe.

3 Perspective Impact Explorer

How AI advancement affects 10 key domains across Europe — from labor markets to local institutions. Each impact category shows opportunities, risks, and concrete descriptions of what life looks like from each perspective.

Category | S1: Plateau | S2: Continued | S3: Accelerated
Labor Market & Skills | S1: Moderate | S2: High | S3: Severe
S1: Plateau

Entry-level roles require AI proficiency. Companies hire fewer juniors as AI handles routine analysis. Productivity gap widens between AI-adopting and lagging firms.

S2: Continued

Entire job categories shrink — accounting, legal research, marketing, software testing. Mid-career professionals face reskilling pressure as AI handles multi-step workflows.

S3: Accelerated

Labor market transforms within 2–3 years in ways that typically took decades. Junior associate work largely automated. Continuous reskilling becomes survival requirement.

If done right: AI productivity gains shared broadly. Work hours reduce to 10–20h/week while maintaining living standards. "Human-only" work commands higher recognition and wages.
Public Finance & Social Systems | S1: Manageable | S2: Significant | S3: Crisis
S1: Plateau

Modest shifts in tax revenue composition. Unemployment claims elevated but manageable. AI-augmented firms become more productive and profitable.

S2: Continued

Treasury departments plan for tax base transitions. Payroll taxes decline in affected sectors. Discussions emerge about AI-specific levies and consumption taxes.

S3: Accelerated

Fiscal systems face potential crisis. Tax revenues from labor decline faster than governments can adapt. Unemployment insurance approaches capacity limits in some regions.

If done right: AI-boosted GDP growth funds adjusted tax systems that share value across all groups. UBI becomes feasible. Public services themselves become more efficient through AI augmentation.
Industry & Competitiveness | S1: Moderate | S2: High | S3: Severe
S1: Plateau

AI quality control improves defect detection. Adoption is uneven — larger companies lead while SMEs struggle to access AI capabilities and expertise.

S2: Continued

Competitive pressure intensifies. American competitors release updates twice as fast. AI compresses development cycles. European firms must adapt or lose market position.

S3: Accelerated

Industrial structure transforms within years. Traditional European strengths partially commoditized as AI enables competitors to match quality at lower cost and faster speed.

If done right: Europe enters the age of semi-autonomous factories. AI-first companies develop new infrastructure for agent communication. Europe leads in complex supply chain management.
Innovation & Startups | S1: Moderate | S2: High | S3: Severe
S1: Plateau

Startups reach $10M ARR with 20 employees instead of 50. Revenue per employee rises. Competition intensifies as AI lowers barriers to entry.

S2: Continued

Startup landscape restructures. Billion-dollar valuations achievable with dozens of employees. Traditional VC model faces structural shift as capital requirements drop.

S3: Accelerated

The concept of a "startup" dissolves. Single-person billion-dollar companies materialize. Software becomes near-free commodity. Value shifts to data, relationships, and trust.

If done right: Everyone with ambition and AI access can build a business that matters. Language barriers dissolve. Europe's innovation potential is steered toward a values-based future.
Science System | S1: Mixed | S2: High | S3: Severe
S1: Plateau

AlphaFold transforms structural biology. AI-assisted literature review helps find relevant research. But integrity challenges emerge: hallucinated citations appear in publication venues.

S2: Continued

AI becomes essential research infrastructure. Labs without AI cannot compete. Drug discovery timelines compress. Integrity verification systems strained significantly.

S3: Accelerated

Scientific research transforms more than at any time since the scientific revolution. AI contributes to hypothesis generation and experimental design. "Scientific taste" becomes the critical bottleneck.

If done right: Universities become centers that steer AI-powered science. Human-agent research teams discover breakthroughs for cancer, clean energy, and sustainable materials at unprecedented speed.
Security & Resilience | S1: Elevated | S2: High | S3: Severe
S1: Plateau

AI improves threat detection but attackers also use AI. Phishing becomes more convincing. Arms race between offense and defense continues at elevated level.

S2: Continued

Attackers deploy AI agents that autonomously probe systems and coordinate attacks. First major AI-vs-AI cyber conflicts emerge. Enterprise security requires fundamental rethinking.

S3: Accelerated

Autonomous cyber weapons become the norm. Multi-stage attacks unfold faster than human response. Critical infrastructure protection requires AI-speed defense systems.

If done right: Agent-based engineering makes software more reliable and secure. Separated infrastructure for critical systems removes attack vectors. Defensive AI matches offensive capabilities.
Digital Public Sphere | S1: High | S2: Severe | S3: Catastrophic
S1: Plateau

Deepfakes are common nuisances. Doctored videos circulate within hours. Voice-cloning fraud increases. Fact-checkers struggle to keep pace with synthetic content.

S2: Continued

Information environment reaches a tipping point. More online content is AI-generated than human-created. Platforms struggle to enforce authenticity policies.

S3: Accelerated

"Authentic" digital content becomes nearly meaningless. Real-time personalized manipulation at population scale. Democratic deliberation faces existential challenge.

If done right: Authentication and provenance systems verify human content. AI enables immense creativity while trust is maintained. Freedom of speech remains a human right.
Health & Care | S1: Positive | S2: Strongly + | S3: Transformative
S1: Plateau

Radiologists work alongside AI that pre-screens imaging. Documentation time drops significantly. Administrative burden decreases across clinical settings.

S2: Continued

AI becomes standard clinical tool across specialties. AI-assisted diagnosis catches conditions doctors might miss. Drug discovery timelines compress significantly.

S3: Accelerated

Healthcare transforms more in 3 years than in the previous 30. AI diagnostics exceed human specialists. Physicians shift from diagnosticians to care orchestrators.

If done right: A real system for individual wellbeing emerges, covering nutrition, sleep, exercise, diagnostics, and treatments. It is available to everyone, and each person decides how deeply AI should be integrated into their life.
Education System | S1: Significant | S2: High | S3: Severe
S1: Plateau

80%+ of students use AI for writing. Detection tools are unreliable. Innovative schools pilot AI-augmented learning; most struggle to adapt curricula and assessment.

S2: Continued

Assessment enters genuine crisis. Traditional exams become meaningless as AI produces expert-level work. Universities experiment with oral exams and project-based evaluation.

S3: Accelerated

Education faces existential questions about its purpose. If AI performs most cognitive tasks, what should schools teach? Emphasis shifts to uniquely human capabilities.

If done right: Children grow up with personal AI teachers that individually support their education. Every child learns effectively based on their interests and capabilities, regardless of background.
Local Institutions | S1: Moderate | S2: High | S3: Severe
S1: Plateau

Forward-thinking cities offer smooth AI-assisted permitting. Documents processed in days rather than weeks. But most municipalities lack resources and expertise to adopt.

S2: Continued

Gap between AI-enabled and traditional local government widens dramatically. Leading cities offer transformed citizen experience. Lagging communities fall further behind.

S3: Accelerated

Speed of change exceeds institutional capacity in many jurisdictions. Cities that transformed deliver unprecedented service quality. Others face potential crisis of legitimacy.

If done right: AI broadly benefits communities and individual lives. Resistance is accepted — societies make space for different relationships with technology. Overall acceptance remains strong.

4 Collection of Response Measures

Measures derived from impact analysis, weighted by probability and time criticality. Distinguishing no-regret actions (needed regardless of perspective) from perspective-conditional preparations.

Perspectives

We define three plausible perspectives for AI capability advancement between 2026 and 2029. They are differentiated solely by the speed of technical progress — not by policy choices, adoption patterns, or societal responses, which are treated as consequences rather than defining features. The perspectives are mutually exclusive and collectively exhaustive: one of them will most closely describe what actually happens. Each perspective carries distinct implications for the urgency, scale, and nature of the response measures European decision-makers should pursue.
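As a small illustration of what "mutually exclusive and collectively exhaustive" implies for the stated probabilities (5%, 50%, 45%), the sketch below checks that they sum to one and adds up the probability of any subset of perspectives. The function names are hypothetical and chosen for illustration; they are not part of the framework.

```python
from math import isclose

# Probabilities as stated for the three perspectives (5%, 50%, 45%).
PERSPECTIVES = {
    "S1: Plateau": 0.05,
    "S2: Continued Pace": 0.50,
    "S3: Accelerated": 0.45,
}


def is_exhaustive(probabilities: dict) -> bool:
    """Return True if the probabilities of mutually exclusive outcomes sum to 1."""
    return isclose(sum(probabilities.values()), 1.0, abs_tol=1e-9)


def combined_probability(probabilities: dict, names: list) -> float:
    """Probability that any one of the named perspectives materializes.

    Simple addition is valid only because the perspectives are
    mutually exclusive.
    """
    return sum(probabilities[name] for name in names)


if __name__ == "__main__":
    print("Exhaustive:", is_exhaustive(PERSPECTIVES))
    at_least_s2 = combined_probability(
        PERSPECTIVES, ["S2: Continued Pace", "S3: Accelerated"]
    )
    print(f"S2 or S3: {at_least_s2:.0%}")
```

Under these figures, a response plan that covers only S1 would be sized for an outcome the framework itself rates at 5%.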

Perspective 1

Plateau

5%

AI capabilities improve only marginally compared to today's frontier models. The rapid capability gains observed in recent years do not continue at the same pace. Progress becomes incremental rather than step-change, as the fundamental architectures and training paradigms encounter diminishing returns or unforeseen bottlenecks.

Typical Improvements

  • Incremental gains in reliability, latency, and cost efficiency
  • Better user interfaces and integration tooling
  • Modest improvements in specific domains through fine-tuning and specialisation
  • Continued reduction in deployment friction (APIs, SDKs, enterprise connectors)

What Remains Difficult

  • Robust long-horizon autonomy (multi-step tasks over extended timeframes with minimal supervision)
  • Consistently correct reasoning under ambiguity and uncertainty
  • Reliable operation in open-ended, high-stakes environments
  • Verifiable alignment with complex human intentions
  • Sim-to-real transfer for robotics remains a significant barrier
S1 represents the generally accepted lower boundary for capability advancement. All impacts described for S1 should be understood as minimum expected impacts, and all measures identified for S1 are the absolute minimum response required.

Why This Perspective Still Matters

A plateau in capability does not mean a plateau in impact. Current AI systems are already capable enough to transform significant portions of knowledge work, administrative processes, and creative production. The primary driver of change is diffusion and adoption, not new capabilities. Organisations that invest in process redesign, workforce development, and systematic integration can realise substantial gains. Those that do not will fall behind — not because of new breakthroughs, but because competitors extract more value from existing capabilities.

Perspective 2

Continued Pace

50%

AI capabilities continue to improve at roughly the pace observed in recent years; agentic workflows become materially more useful. Frontier models become meaningfully more capable through a combination of scaling, algorithmic improvements, better training data, and advances in post-training techniques (reinforcement learning, inference-time reasoning, tool use). Progress is steady but not explosive.

Typical Improvements

  • Stronger planning and multi-step reasoning capabilities
  • More reliable tool use and structured outputs (code, data manipulation, API calls)
  • Improved grounding on enterprise data and context
  • Better multimodal handling (text, image, audio, video in combination)
  • Partially autonomous workflows for bounded domains (software development, analytics, customer service, document processing)
  • Significant improvement in sim-to-real transfer for robotics, enabling practical deployment in logistics, manufacturing, and service environments

New Normal (12-36 months)

  • AI agents that can execute multi-step workflows with moderate supervision (e.g., draft-review-revise cycles in writing, coding, or analysis)
  • Autonomous handling of well-defined operational tasks (scheduling, monitoring, routine decisions) under explicit controls
  • Software development becomes substantially automated, with AI handling the majority of routine coding, testing, and debugging
  • Core organisational processes begin running with limited human supervision in leading organisations
Diffusion remains uneven. Organisations with strong change capacity, data infrastructure, and AI-literate workforces pull ahead. Laggards, including many public sector bodies and SMEs, struggle to keep pace. The gap between leaders and followers widens.
Perspective 3

Accelerated

45%

AI capabilities compound rapidly; discontinuities in autonomy and R&D automation become plausible within 36 months. Multiple drivers of progress align and reinforce each other. Advances in architectures, training methods, or emergent capabilities produce step-changes that accelerate the rate of improvement itself. AI systems begin to contribute meaningfully to AI research, creating feedback loops that compress timelines significantly. The possibility of reaching highly capable, general-purpose AI systems enters the plausible range.

Expert Timelines

Expert | Organisation | Timeline
Dario Amodei | Anthropic | 2026-2027
Sam Altman | OpenAI | Mid-2020s
Demis Hassabis | Google DeepMind | 2028-2030
Eric Schmidt | Former Google CEO | 2-6 years

Typical Improvements

  • Rapid cycles of capability increases, with shorter intervals between major advances
  • Substantially more autonomous execution across broad task categories
  • Meaningful AI contribution to research and engineering workflows (experiment design, code generation, debugging, scientific discovery)
  • Emergence of capabilities that were not explicitly trained or anticipated
  • Robotic applications expand dramatically, including viable deployment of humanoid robots in commercial and industrial settings

Agentic Capabilities

  • Autonomy in substantial end-to-end processes becomes viable, not just bounded subtasks
  • AI systems can manage complex, multi-stage projects with minimal human intervention
  • Coordination between multiple AI agents becomes routine for complex workflows
  • Human oversight shifts from task-level supervision to goal-level governance

Stress Points

  • Policy and regulation: Governance frameworks designed for slower change become outdated before implementation is complete
  • Education and workforce: Curricula cannot adapt fast enough; cohorts of workers face rapid skill obsolescence across many professions simultaneously
  • Public services: Demand for support (unemployment, retraining, fraud prevention) may spike while institutional capacity lags severely
  • Information integrity: Synthetic content may fundamentally outpace detection capabilities, challenging the foundations of trust in digital communication
  • Economic structure: Questions about value distribution and employment arise faster than political systems can address them
Competitive pressure becomes extreme. Product and service cycles compress dramatically; the productivity gap between AI-native organisations and traditional competitors becomes a chasm. Workforce transformation cannot wait for gradual reskilling. Some organisations will thrive by embracing radical transformation; others will face severe disruption despite best efforts, simply due to the speed of change.
Perspective Comparison

A side-by-side comparison of key dimensions across all three perspectives.

Dimension | S1: Plateau | S2: Continued Pace | S3: Accelerated
Capability Speed | Marginal improvement; gains mostly in cost and latency | Steady capability gains; regular new model generations | Compounding gains; potential discontinuities
Agentic Reliability | Narrow, tool-like assistance | Multi-step workflows with moderate supervision | Substantial end-to-end autonomy
R&D Automation | Assists coding and routine analysis | AI handles majority of routine engineering | Closed-loop acceleration; AI contributing to AI research
Robotics | Lab-to-production gap persists | Viable commercial deployment at scale | Economically transformative; mass deployment
Adoption Pattern | Diffusion of current capabilities | Uneven but accelerating adoption | Forced adaptation; speed of change exceeds institutional capacity
Institutional Stress | Manageable; existing frameworks adapt | Significant strain; governance gaps emerge | Potential crisis; frameworks outdated before implemented
Competitive Dynamics | Gradual shifts | Intensifying pressure; leaders pull ahead | Existential pressure in exposed sectors
Key Opportunity | Systematic integration of existing AI | Productivity leadership through adoption | Fundamental transformation of economic structure
Key Risk | Complacency; missed adoption window | Competitiveness gap with AI-leading regions | Loss of agency; societal disruption
Decision-Maker Focus | Build adoption capacity | Accelerate governance + adoption | Emergency preparedness + societal resilience
Base Conditions

These conditions apply across all three perspectives. They represent structural realities that cannot be changed within the 36-month timeframe.

Before examining perspectives, we establish baseline conditions that apply regardless of which perspective materializes. These are structural realities that constrain change speed and Europe's room for maneuver.

Compute, Energy & Infrastructure

  • Top 4 US hyperscalers are investing over $600 billion in 2026 alone in AI infrastructure.
  • Data center buildout takes 2-4 years from decision to operation, creating a binding constraint on near-term compute supply.
  • Grid capacity is the binding constraint in many regions. New data centers are competing for limited power.
  • Chip supply is concentrated outside Europe. TSMC, Samsung, and Intel dominate advanced node fabrication.
  • Quantum computing will not materially affect AI capabilities within this 36-month timeframe.

Practical Implication

Near-term progress depends primarily on efficiency gains: better algorithms, improved training recipes, and inference optimization, rather than raw compute scaling alone.

Europe's Dependency Exposure

Layer | Dependency
Chips | Advanced logic chips fabricated almost exclusively outside Europe (TSMC, Samsung, Intel)
Cloud | US-based hyperscalers dominate; no single European player suited for large-scale AI workloads
Foundation Models | Majority of frontier models from US or Chinese organizations; European alternatives smaller-scale
Talent | Strong research talent base but intense retention competition from US labs offering higher compensation

Practical Implication

Europe's position in the AI era will be shaped far more by operational competence — how effectively organizations adopt, deploy, and govern AI — than by resolving these structural dependencies within the timeframe.

Institutional & Organizational Capacity

  • Public sector: 12-24 month procurement cycles, generally low AI literacy among decision-makers, and rigid organizational structures limit adoption speed.
  • Large enterprises: Accelerating AI adoption but highly uneven across sectors and functions. Pilot-to-production gap remains significant.
  • SMEs: Constrained by limited resources, expertise, and access to AI talent. Many lack even basic digital infrastructure.

Key Insight

The execution gap — the distance between what AI can technically do and what organizations actually achieve with it — is the central determinant of realized impact. This gap is driven by organizational capacity, not technology.

Competitive Pressure & Forced Adaptation

  • Speed-to-market compression: AI-enabled competitors can iterate and ship faster, reducing the window for traditional players to respond.
  • Productivity arbitrage: Organizations using AI effectively achieve 20-50% productivity gains, creating cost advantages that compound over time.
  • Winner-take-more dynamics: Network effects and data advantages amplify early-mover benefits, widening gaps between leaders and laggards.
  • The "San Francisco consensus": Silicon Valley leaders broadly expect the world to change fundamentally within approximately 3 years. Whether or not this proves correct, the capital and talent being deployed against this belief creates its own momentum.
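The productivity arbitrage bullet above can be made concrete with a little arithmetic. The sketch below converts a productivity gain into a relative unit cost, under the simplifying assumption (not stated in the source) that both firms pay the same input costs and that output per unit of input rises by the stated gain. It shows a single-period advantage only and does not model how the advantage compounds over time.

```python
def relative_unit_cost(productivity_gain: float) -> float:
    """Unit cost of an AI-adopting firm relative to a non-adopting one.

    Assumes identical input costs and that output per unit of input
    rises by `productivity_gain` (0.20 means +20%). Illustrative only.
    """
    if productivity_gain <= -1.0:
        raise ValueError("productivity_gain must be greater than -1")
    return 1.0 / (1.0 + productivity_gain)


if __name__ == "__main__":
    # The 20-50% range quoted in the list above.
    for gain in (0.20, 0.50):
        cost = relative_unit_cost(gain)
        print(f"{gain:.0%} gain -> unit cost at {cost:.1%} of baseline")
    # prints:
    # 20% gain -> unit cost at 83.3% of baseline
    # 50% gain -> unit cost at 66.7% of baseline
```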

Key Takeaway

For organizations in competitive sectors, AI adaptation is not optional — it is a survival requirement. The question is not whether to adopt, but how quickly and effectively.

Assessment of Perspective Probabilities

Probabilities are best-guess estimates for decision-making, not precise forecasts. They are re-estimated when monitoring detects material changes.

We identify four core technical drivers that determine which perspective materializes, track them via observable signals, and consolidate them into probability estimates. A driver is included only if it is (a) highly uncertain within 36 months, (b) meaningfully affects capability speed, and (c) can be tracked via observable signals. Topics like compute buildout, open-source ecosystem health, and regulation are excluded — compute is a baseline condition (Chapter 1), while open-source and regulation affect deployment, not fundamental capability speed.
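The three inclusion criteria (a) to (c) amount to a simple conjunctive filter. The sketch below is one hypothetical way to express it; the class and field names are illustrative, and the values shown for criteria (a) and (c) on the excluded topics are placeholder assumptions, since the text only states that open-source and regulation fail criterion (b).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateDriver:
    name: str
    highly_uncertain_within_36_months: bool  # criterion (a)
    affects_capability_speed: bool           # criterion (b)
    trackable_via_observable_signals: bool   # criterion (c)


def is_core_driver(candidate: CandidateDriver) -> bool:
    """A candidate is included only if it meets all three criteria."""
    return (
        candidate.highly_uncertain_within_36_months
        and candidate.affects_capability_speed
        and candidate.trackable_via_observable_signals
    )


# Illustrative candidates. Only the False on criterion (b) for the last
# two comes from the text; the other values are assumed for the example.
CANDIDATES = [
    CandidateDriver("Architectures & Training Paradigms", True, True, True),
    CandidateDriver("Open-source ecosystem health", True, False, True),
    CandidateDriver("Regulation", True, False, True),
]

if __name__ == "__main__":
    for candidate in CANDIDATES:
        status = "included" if is_core_driver(candidate) else "excluded"
        print(f"{candidate.name}: {status}")
```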

1 Architectures & Training Paradigms

Whether AI progress comes from scaling within current paradigms or whether fundamentally new architectures emerge. This driver determines the ceiling of what is possible.
S1: Plateau

No paradigm shift materializes. Diminishing returns from scaling within current transformer architectures. Post-training and efficiency gains are incremental. ARC-AGI-2 performance stagnates below 50%.

S2: Continued Pace

Continued engineering gains within existing paradigms. Substantial post-training advances (reasoning models, multimodal integration) without requiring architectural revolution. Steady compounding of improvements.

S3: Accelerated

Either scaling proves to have more headroom than expected, yielding surprising capability gains, or a genuine paradigm shift emerges (nested learning, causal AI, neuromorphic). ARC-AGI-2 performance approaches human level.

Signals to Monitor
ARC-AGI-2 & Novel Reasoning Benchmarks

ARC-AGI-2 tests genuine adaptation with both high adaptability and high efficiency. It is considered the gold standard for measuring AI's ability to handle truly novel situations that cannot be solved by pattern matching alone. The benchmark requires both high accuracy AND cost efficiency: the human panel achieved 100% at just $17 per task. [16]
Tests genuine adaptation vs. pattern matching — the gold standard for measuring if AI can handle novel situations beyond training data. Tracks both raw scores and cost-efficiency.
Perspective Alignment: Trending toward S3
Current Read
Human panel baseline: 100% at $17/task. Best AI: Google Gemini 3 Deep Think achieves 84.6% at $13.62/task — rapidly closing the gap in both performance and cost-efficiency. Previous best (o3) scored 75.7% at significantly higher cost. The trajectory shows quarterly improvements of 5-10 percentage points, suggesting human-level performance on this benchmark could be reached within 12-18 months. The cost-efficiency improvements are particularly notable, as they indicate fundamental capability gains rather than brute-force scaling. [16]
Confidence: Medium-High. The benchmark is well defined with a clear methodology, but it is new and has a shorter track record than established benchmarks.
Sources: [16]
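To make the trajectory reasoning in the Current Read easier to check, the sketch below applies a naive linear extrapolation to the figures quoted there (100% human baseline, 84.6% best AI score, 5 to 10 percentage points of improvement per quarter). The helper name is hypothetical. A straight-line projection is an optimistic bound: it closes the gap in well under a year, whereas the Current Read's 12-18 month estimate is longer, which would be consistent with gains slowing as scores approach the ceiling (an assumption on our part, not something the source states).

```python
HUMAN_BASELINE = 100.0  # percent, human panel (from the Current Read)
BEST_AI_SCORE = 84.6    # percent, best reported AI result (from the Current Read)
MONTHS_PER_QUARTER = 3


def quarters_to_close(gap_points: float, points_per_quarter: float) -> float:
    """Quarters needed to close a gap at a constant rate of improvement."""
    if points_per_quarter <= 0:
        raise ValueError("points_per_quarter must be positive")
    return gap_points / points_per_quarter


if __name__ == "__main__":
    gap = HUMAN_BASELINE - BEST_AI_SCORE
    fast = quarters_to_close(gap, 10.0) * MONTHS_PER_QUARTER
    slow = quarters_to_close(gap, 5.0) * MONTHS_PER_QUARTER
    print(f"Remaining gap: {gap:.1f} points")
    print(f"At 10 points/quarter: {fast:.1f} months")
    print(f"At 5 points/quarter: {slow:.1f} months")
    # prints:
    # Remaining gap: 15.4 points
    # At 10 points/quarter: 4.6 months
    # At 5 points/quarter: 9.2 months
```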
New Architecture Announcements

The Architecture Debate: Three positions define the field. Scaling advocates (OpenAI, xAI) argue current transformer architectures have substantial headroom: the next 10x compute will yield major gains. Architecture innovators (DeepMind, INAIT, Sakana.ai) believe transformers have fundamental limits in context, reasoning, and world understanding. A hybrid perspective holds that progress comes from both: scaling current systems while integrating architectural innovations.

Key paradigms: Nested Learning (DeepMind, NeurIPS 2025) introduces continuum memory systems and deep optimizers. Causal AI (EPFL/INAIT) shifts from correlational pattern matching to causal understanding, with a Microsoft Azure partnership. [20][21][22][23]
Would signal a paradigm shift away from transformers. Tracks major lab releases, benchmark results on new architectures, and production deployment of non-transformer systems. The strongest accelerant for Perspective 3.
Perspective Alignment: Stable / Watching
Current Read
Multiple new approaches are in active development but none have reached production deployment. Key initiatives:

  (1) Nested Learning (Google DeepMind, NeurIPS 2025): continuum memory systems and deep optimizers showing promising results on continual learning tasks.
  (2) INAIT / Causal AI (EPFL spin-off): strategic Microsoft Azure partnership for deploying causal reasoning at scale. Three product lines: inference engines, discovery platforms, intervention optimizers.
  (3) Sakana.ai: evolutionary approaches to architecture search, "AI Scientist" framework for automated research.
  (4) AMI Labs: bio-inspired architectures.

Assessment: Research is vibrant but remains pre-production. A verified breakthrough here would be the single strongest signal for Perspective 3. [17][20][21][22][23]
Confidence: Medium — multiple credible efforts but high uncertainty about which (if any) approach will achieve transformative results
Architectural Innovation Research

Tracks the volume and quality of research papers on fundamentally new AI paradigms beyond transformers, including continual learning, causal reasoning, neuro-symbolic AI, world models, and recursive self-improvement. A sustained increase in breakthrough publications would indicate the field is moving beyond incremental engineering gains. [20][82]
Tracks publication volume and quality across continual learning, causal AI, recursive self-improvement, and world model research. An acceleration here signals paradigm shift potential.
Perspective Alignment: Growing momentum
Current Read
Active research across multiple organizations with growing publication volume. Key threads: continual learning frameworks (DeepMind's nested learning), causal AI platforms (INAIT/EPFL with Microsoft partnership), neuro-symbolic integration, and world model development. Publication volume in these areas has increased roughly 40% year-over-year at top venues (NeurIPS, ICML, ICLR). While no single approach has demonstrated clear superiority over fine-tuned transformers on production tasks, the breadth of exploration suggests the field is actively hedging against transformer limitations. The growing involvement of major labs (Google DeepMind, Meta FAIR, Microsoft Research) adds credibility. [20][21][82]
Confidence: Medium-High — broad research activity is clearly measurable, though predicting which specific approach will succeed remains uncertain
Sources: [20] [21] [82]
Compute Infrastructure Scaling

Massive compute investment is both a signal and a driver. It tests whether scaling laws continue to hold: if they do, more compute directly translates to more capable models. The $600B+ investment level in 2026 reflects deep industry conviction that scaling still works. Key question: are we seeing diminishing or accelerating returns per FLOP? [1]
Tests whether scaling laws continue to hold. Massive infrastructure investment signals deep industry conviction. Tracks data center capacity, training run scales, and efficiency metrics.
Perspective Alignment: Strong S2-S3 signal
Current Read
Investment levels are unprecedented and accelerating. OpenAI is targeting 1 GW of compute capacity in early 2026, with an additional 1 GW planned by year-end — representing a massive expansion from the ~100 MW scale of 2024 training runs. Hyperscaler capex (Meta, Microsoft, Amazon, Google) exceeded $250B in 2025 and is projected at $350B+ for 2026. The industry consensus implied by these investments is that scaling continues to yield capability gains, though the returns-per-dollar question remains actively debated. Notably, efficiency improvements (better architectures, quantization, distillation) are compounding with raw scale increases, meaning effective compute is growing faster than FLOP counts alone would suggest. [1]
Confidence: High — investment data is publicly verifiable and infrastructure buildout is a strong commitment signal
Sources: [1]
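The scale-up described in the Current Read can be restated as simple ratios. The sketch below uses only the figures quoted there (roughly 100 MW for 2024 training runs, 1 GW targeted in early 2026 plus a further 1 GW by year-end, and hyperscaler capex of $250B rising to $350B). Because the capex figures are quoted as lower bounds ("exceeded", "$350B+"), the resulting growth rate is approximate. The constant and function names are illustrative.

```python
def growth_factor(new_value: float, old_value: float) -> float:
    """Ratio of a new value to an old one (2.0 means a doubling)."""
    if old_value <= 0:
        raise ValueError("old_value must be positive")
    return new_value / old_value


# Figures as quoted in the Current Read; approximate by nature.
TRAINING_SCALE_2024_MW = 100
TARGET_EARLY_2026_MW = 1_000                      # 1 GW
TARGET_END_2026_MW = TARGET_EARLY_2026_MW + 1_000  # additional 1 GW by year-end
CAPEX_2025_BN_USD = 250
CAPEX_2026_BN_USD = 350

if __name__ == "__main__":
    early = growth_factor(TARGET_EARLY_2026_MW, TRAINING_SCALE_2024_MW)
    late = growth_factor(TARGET_END_2026_MW, TRAINING_SCALE_2024_MW)
    capex = growth_factor(CAPEX_2026_BN_USD, CAPEX_2025_BN_USD) - 1.0
    print(f"Early-2026 target vs 2024 scale: {early:.0f}x")
    print(f"End-2026 target vs 2024 scale: {late:.0f}x")
    print(f"Capex growth 2025 to 2026: about {capex:.0%}")
    # prints:
    # Early-2026 target vs 2024 scale: 10x
    # End-2026 target vs 2024 scale: 20x
    # Capex growth 2025 to 2026: about 40%
```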
World Model Demonstrations

World models represent a core capability gap identified by skeptics of current AI (notably Yann LeCun). Moving from statistical language models to systems with genuine persistent world understanding is considered by many researchers as necessary for achieving robust reasoning and physical interaction. LeCun's group at Meta has invested $3.5B in JEPA-based world model research. [14][19]
Tracks development of systems with genuine persistent world understanding — the gap between language models and true reasoning. Core capability frontier identified by skeptics of current approaches.
Perspective Alignment: Pre-production
Current Read
Research is active across multiple major labs but no production deployment exists yet. Key programs:

  (1) DeepMind Genie / Genie 2: generative world models for interactive environments, demonstrated impressive results on game-like environments but not yet applied to real-world tasks.
  (2) Meta JEPA-2: Yann LeCun's joint-embedding predictive architecture, designed for learning world models from video data. Meta has committed $3.5B to this research direction.
  (3) AMI Labs: bio-inspired approaches to persistent world modeling.

Assessment: This remains the most uncertain signal. A production breakthrough here would be transformative, but the gap between research demos and deployed systems remains wide. The timeline for real-world impact is likely 2-4 years beyond benchmark demonstrations. [14][19]
Confidence: Low-Medium — research direction is clear but production viability highly uncertain; multiple credible approaches but none proven at scale
Sources: [14] [19]
Expert Views on Architectures & AGI Timelines
Expert | Organization | AGI Timeline | Architecture View | Source
Dario Amodei | Anthropic | 2026–2027 | Current paradigm + scaling + AI-assisted R&D likely sufficient | [6]
Sam Altman | OpenAI | Mid-2020s | Scaling + reasoning modules (o-series) on path to AGI | [7]
Demis Hassabis | Google DeepMind | 2028–2030 | Extensions within deep learning; continual learning, world models needed | [9]
Yann LeCun | AMI Labs (ex-Meta) | Not within 2 years | Transformers insufficient; world models and causal reasoning required | [14]
Llion Jones | Sakana AI | 1–2 breakthroughs away | Recursive self-improvement needed; warns of transformer "gravitational well" | [17]
Henry Markram | INAIT/EPFL | New paradigm emerging | Causal AI: brains are participants, not observers; fundamentally different from transformers | [18]
Ilya Sutskever | SSI (ex-OpenAI) | Coming, timing uncertain | Scaling alone not sufficient; fundamental rethinking of pre-training needed | [12]

These views are tracked and updated regularly as experts revise their positions.

Directional Impact on Probabilities
→ S3: Evidence of paradigm shift (Nested Learning, Causal AI, or other approaches achieving production deployment with verified superior performance). → S2: Sustained engineering gains without paradigm shift. → S1: Diminishing returns on scaling and post-training. Current assessment: Multiple promising architectural innovations reduce S1 probability; strong investment in scaling supports S2; breakthrough potential elevates S3.

2 Agentic Autonomy & Orchestration

The ability of AI systems to plan, execute multi-step tasks, use tools, and operate autonomously in real-world environments. This driver determines the practical impact frontier.
S1: Plateau

Agents remain unreliable beyond narrow, well-defined tasks. Error rates too high for unsupervised deployment. Human oversight remains essential for all but the most routine operations.

S2: Continued Pace

Practical agentic AI in bounded enterprise workflows. Agents handle multi-step tasks with bounded autonomy (hours). SWE-bench approaches 90%. Growing real-world deployment.

S3: Accelerated

Substantial end-to-end autonomy across complex, open-ended tasks. SWE-bench >95%. Agents routinely manage multi-day workflows. Paradigm shift in how knowledge work is organized.

Signals to Monitor
SWE-bench Progression
SWE-bench is the gold standard for measuring AI's ability to resolve real GitHub issues. The "bash" variant tests end-to-end resolution; "Verified" uses human-confirmed solvable issues. These benchmarks directly measure whether AI can do meaningful software engineering work autonomously. The rapid improvement trajectory (from ~30% to ~80% in 18 months) is one of the strongest signals for advancing agent capability. [24][25][71][72]
Direct measure of practical coding and real-world problem solving capability. Tracks quarterly leaderboard updates across bash and verified variants. The most widely cited agent capability benchmark.
Perspective Alignment: ▶ Rapid improvement [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
SWE-bench Bash: 76.8% (Claude Opus 4.5), which falls within the estimated human baseline range of 70-80% and effectively closes the gap. SWE-bench Verified: 79.2% (Claude Opus 4.5, high compute), about 21 points below the 100% ceiling (by design, all issues are solvable). The trajectory has been remarkable: scores improved from ~33% in mid-2024 to nearly 80% in early 2026, a roughly 2.4-fold increase in under two years. If absolute gains continue at a similar pace, 90%+ on Verified could be reached by late 2026/early 2027. Sonar also claimed the top spot, pointing to multi-vendor competition. [24][25][71][72]
Confidence: High — well-established benchmark with transparent methodology, multiple competing models, and publicly verifiable leaderboard
Sources: [24] [25] [71] [72]
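The growth arithmetic behind the SWE-bench Verified read can be checked with a few lines. The following is a minimal sketch, not part of the original analysis: it assumes the ~33% and 79.2% figures quoted above and an elapsed time of roughly 20 months (an assumption for "mid-2024 to early 2026"), and compares the implied doubling time with a simple linear extrapolation to 90%.

```python
import math

# Figures quoted in the text; the 20-month span is an assumption.
START_SCORE = 33.0     # percent, mid-2024
CURRENT_SCORE = 79.2   # percent, early 2026
ELAPSED_MONTHS = 20.0  # assumed elapsed time

def fold_increase(start: float, end: float) -> float:
    """Ratio of the current score to the starting score."""
    return end / start

def implied_doubling_months(start: float, end: float, months: float) -> float:
    """Doubling time implied by exponential growth between two points."""
    return months * math.log(2) / math.log(end / start)

def months_to_target_linear(start: float, end: float, months: float, target: float) -> float:
    """Months needed to reach `target` if absolute gains per month stay constant."""
    points_per_month = (end - start) / months
    return (target - end) / points_per_month

fold = fold_increase(START_SCORE, CURRENT_SCORE)
doubling = implied_doubling_months(START_SCORE, CURRENT_SCORE, ELAPSED_MONTHS)
to_90 = months_to_target_linear(START_SCORE, CURRENT_SCORE, ELAPSED_MONTHS, 90.0)
print(f"fold increase: {fold:.2f}x, implied doubling time: {doubling:.1f} months")
print(f"months to 90% at constant absolute gains: {to_90:.1f}")
```

Under these assumptions the score grew about 2.4-fold with an implied doubling time of roughly 16 months, and a constant-gain extrapolation reaches 90% in about 5 months. The later date range given in the text is plausible if the remaining issues are harder than average, which a linear model does not capture.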
OSWorld Progression
OSWorld measures AI's ability to interact with real operating system environments: using applications, managing files, navigating GUIs. This is critical for agent deployment because it tests the kind of tasks that enterprise agents need to perform. The improvement from roughly 12% at the benchmark's 2024 launch to 73% is among the fastest benchmark gains ever observed in AI. [28]
Measures real-world computer environment interaction: using applications, managing files, navigating GUIs. Critical for determining whether AI agents can operate in actual work environments.
Perspective Alignment: ▶ Dramatic acceleration [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Current best: 73%, one of the most dramatic improvements in AI benchmarking history. When the benchmark launched in 2024, the best scores were around 12%. The human baseline is 72%, meaning the best system now matches or marginally exceeds human-level performance on standardized OS interaction tasks. This benchmark tests practical tasks like installing software, managing files across applications, configuring system settings, and completing multi-step workflows across desktop environments. The near-vertical improvement curve suggests that once multimodal agents achieved basic visual understanding of GUIs, performance gains compounded rapidly. Key question going forward: will this translate to reliable real-world deployment, or do benchmark conditions (clean environments, well-defined tasks) overstate production readiness? [28]
Confidence: High — transparent benchmark with publicly available evaluation framework; improvement trajectory is clearly measurable
Sources: [28]
GAIA Scores
GAIA (General AI Assistants benchmark), introduced by researchers from Meta AI, Hugging Face, and AutoGPT, measures holistic assistant capability: multi-step research tasks requiring reasoning, web browsing, tool use, and synthesis. Unlike narrow benchmarks, GAIA tests the kind of end-to-end capabilities that define a truly useful AI assistant. The 17-point gap to human performance represents the clearest measure of remaining distance to general assistant capability. [26]
Holistic measure of multi-step research, reasoning, and tool use capability. The most comprehensive test of general AI assistant competence. Tracks top scores and gap to human baseline.
Perspective Alignment: ▶ Steady closing [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Top score: 74.55% (Claude Sonnet 4.5) vs. 92% human baseline — a 17-point gap. GAIA is deliberately harder than most benchmarks because it requires integrating multiple capabilities: web research, mathematical reasoning, code execution, and multi-step planning. The gap has been narrowing at roughly 3-4 points per quarter. At this rate, human parity could be approached by mid-2027. Importantly, GAIA performance correlates well with real-world assistant utility — models that score higher on GAIA tend to be rated as more helpful in user studies. The remaining gap is concentrated in Level 3 tasks requiring the most complex multi-step reasoning. [26]
Confidence: High (well-designed benchmark with clear human baselines and multi-level difficulty structure)
Sources: [26]
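The "human parity by mid-2027" estimate follows from dividing the remaining gap by the quarterly closing rate. A minimal sketch of that arithmetic, assuming the figures quoted above (74.55% top score, 92% human baseline, 3-4 points per quarter) and a constant closing rate, which is a simplifying assumption:

```python
TOP_SCORE = 74.55       # percent, from the text
HUMAN_BASELINE = 92.0   # percent, from the text

def quarters_to_parity(score: float, human: float, points_per_quarter: float) -> float:
    """Quarters needed to close the gap at a constant closing rate."""
    return (human - score) / points_per_quarter

fast = quarters_to_parity(TOP_SCORE, HUMAN_BASELINE, 4.0)
slow = quarters_to_parity(TOP_SCORE, HUMAN_BASELINE, 3.0)
print(f"gap: {HUMAN_BASELINE - TOP_SCORE:.2f} points")
print(f"quarters to parity: {fast:.1f} (at 4 pts/quarter) to {slow:.1f} (at 3 pts/quarter)")
```

Counting from early 2026, the faster rate lands around mid-2027 and the slower rate closer to late 2027, so the mid-2027 figure corresponds to the optimistic end of the stated range.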
Enterprise Deployment Scale
Enterprise deployment is the ultimate market validation for agentic AI. Benchmarks show capability, but production deployment shows willingness to trust agents with real business processes. The shift from pilots to production deployment is a qualitative signal that agent reliability has crossed a critical threshold. Key metrics: number of production deployments, task complexity, and autonomous operation duration. [2][105]
Market validation: tracks real organizations deploying AI agents in production workflows. Monitors vendor releases, enterprise case studies, and the pilot-to-production conversion rate.
Perspective Alignment: ▶ Accelerating adoption [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Enterprise agent deployment is growing rapidly and transitioning from pilot programs to production systems. Major developments: all major cloud providers (AWS, Azure, GCP) now offer agent-building platforms. Deloitte's 2026 survey shows 67% of enterprises have moved at least one AI agent initiative from pilot to production, up from 23% in 2024. Key verticals leading adoption: financial services (fraud detection, compliance), software engineering (code review, testing), and customer service. The average autonomous task duration has increased from minutes to hours, with METR research showing this duration is doubling approximately every 7 months — extrapolating to 2-week autonomous task capability by 2029. Notable: Lovable (AI coding platform) reached unicorn status in just 8 months, signaling strong market demand for agent-based tools. [2][29][102][104][105]
Confidence: High — multiple independent data points from industry surveys, vendor announcements, and market research confirm the trend
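The extrapolation from a 7-month doubling time to two-week tasks depends heavily on the starting task horizon, which the text gives only as "hours". The sketch below makes that dependence explicit. The starting horizons and the definition of "two weeks" as 80 working hours are assumptions for illustration, not values from the cited research.

```python
import math

DOUBLING_MONTHS = 7.0    # doubling time quoted in the text
TWO_WORK_WEEKS_H = 80.0  # assumption: two 40-hour work weeks

def months_until(horizon_now_h: float, target_h: float,
                 doubling_months: float = DOUBLING_MONTHS) -> float:
    """Months until the task horizon reaches target_h under steady doubling."""
    return doubling_months * math.log2(target_h / horizon_now_h)

# Hypothetical starting horizons, all within the "hours" range.
for start_h in (2.5, 5.0, 10.0):
    months = months_until(start_h, TWO_WORK_WEEKS_H)
    print(f"start {start_h:>4} h -> {months:.0f} months to an 80 h task horizon")
```

With these assumed starting points the target is reached in 21 to 35 months, which from early 2026 spans late 2027 to early 2029. That is consistent with the "by 2029" reading, provided the doubling trend holds.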
OpenClaw Adoption & Security
OpenClaw represents the qualitative shift from chat-based AI to system-integrated agents that interact directly with computer systems: executing commands, managing files, browsing the web, and controlling applications autonomously. Vendor security analyses found a 26% vulnerability rate in analyzed deployments, highlighting that agent capability is advancing faster than security readiness. This tension between capability and safety is a key factor determining deployment speed. [32][33][123]
Measures maturity of system-integrated AI agents and security readiness. Tracks community adoption metrics, security incident reports, and vulnerability disclosure rates.
Perspective Alignment: ▶ High adoption, security drag [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Adoption is high and growing, but security concerns are a significant deployment brake. Community metrics show rapid adoption across developer ecosystems. However, Acronis security analysis revealed a 26% vulnerability rate in analyzed deployments — meaning roughly 1 in 4 agent deployments has exploitable security flaws. Cisco's research further confirmed that personal AI agents like OpenClaw represent a "security nightmare" when deployed without proper guardrails. Key vulnerabilities: prompt injection, privilege escalation, data exfiltration, and supply chain attacks on agent tool chains. The security community is responding with frameworks and standards, but adoption of security best practices lags behind deployment speed. This tension is likely to moderate the pace of enterprise adoption even as technical capabilities improve. [32][33][123]
Confidence: Medium — adoption data is strong but the security landscape is evolving rapidly, making risk assessment uncertain
Sources: [32] [33] [123]
Multi-Agent Coordination
Multi-agent systems allow multiple AI agents to collaborate on complex workflows: one agent researches, another codes, a third reviews, a fourth deploys. This is critical because many real-world tasks exceed the capability of a single agent but could be handled by coordinated teams. The emergence of production multi-agent systems (e.g., orchestrated coding pipelines) represents a qualitative shift in how AI work gets organized. [40]
Tracks the viability of complex workflows through multiple agents working together. Monitors research benchmarks, production deployments, and orchestration framework maturity.
Perspective Alignment: ▶ Emerging rapidly [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Active development with production systems beginning to emerge, though still early. Research benchmarks show multi-agent systems outperforming single agents on complex tasks by 20-40% in controlled settings. Production examples include multi-agent coding pipelines (research → implement → test → review) and customer service escalation chains. Key frameworks gaining traction: LangGraph, CrewAI, AutoGen. However, coordination overhead, error propagation between agents, and debugging complexity remain significant challenges. The gap between demo-quality multi-agent systems and production-reliable ones is substantial. Most production deployments use 2-3 agents in simple pipelines rather than the more complex topologies shown in research. [40]
Confidence: Medium — concept is proven in research but production-grade multi-agent systems are still maturing; limited standardized benchmarks for evaluation
Sources: [40]
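One way to see why error propagation keeps production pipelines short is to model a sequential pipeline in which every agent must succeed. The sketch below is illustrative only: the per-step success rate, the independence of failures, and the review catch rate are hypothetical assumptions, not measurements from the cited sources.

```python
def pipeline_success(step_success: float, n_steps: int) -> float:
    """Probability that all steps succeed, assuming independent failures."""
    return step_success ** n_steps

def with_review(chain_success: float, catch_rate: float) -> float:
    """Success after a final reviewer catches a fraction of failed runs (assumed fixable)."""
    return 1.0 - (1.0 - chain_success) * (1.0 - catch_rate)

STEP_SUCCESS = 0.95  # hypothetical per-agent reliability

for n in (2, 3, 4, 8):
    print(f"{n} agents in sequence: {pipeline_success(STEP_SUCCESS, n):.3f}")

three_step = pipeline_success(STEP_SUCCESS, 3)
print(f"3 agents plus a reviewer catching half of failures: {with_review(three_step, 0.5):.3f}")
```

Under these assumptions end-to-end reliability falls from about 0.90 with two agents to about 0.66 with eight, which is in line with the observation that most production deployments use 2-3 agents in simple pipelines.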
Directional Impact on Probabilities
→ S3: SWE-bench exceeding 90%; OSWorld gap widening further; enterprise deployment at scale. → S2: Continued benchmark improvement + successful but bounded enterprise deployment. → S1: Major security incidents slowing deployment; reliability plateaus. Current assessment: SWE-bench at human parity, OSWorld jump from 12%→73%, and GAIA at 74.55% strongly support S2/S3.
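The framework does not spell out how a signal shifts the perspective probabilities. One possible formalization is a Bayesian update, sketched below. The priors are the headline probabilities from this outlook (S1 5%, S2 50%, S3 45%); the likelihood values are purely hypothetical placeholders for how strongly an observed signal is expected under each perspective.

```python
def update(priors: dict[str, float], likelihoods: dict[str, float]) -> dict[str, float]:
    """Bayes' rule: posterior is proportional to prior times likelihood, renormalized."""
    unnormalized = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}

# Headline probabilities from the Perspective Assessment.
priors = {"S1": 0.05, "S2": 0.50, "S3": 0.45}

# Hypothetical: relative likelihood of seeing strong agentic benchmark gains
# under each perspective. These numbers are illustrative assumptions.
likelihoods = {"S1": 0.2, "S2": 1.0, "S3": 1.5}

posterior = update(priors, likelihoods)
for name, p in posterior.items():
    print(f"{name}: {priors[name]:.2f} -> {p:.3f}")
```

The value of writing the update this way is that the assumed likelihoods become explicit and can be challenged, rather than being implicit in a qualitative "supports S2/S3" statement.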

3 Automation of AI R&D and Software Engineering

The degree to which AI accelerates its own development and automates software engineering. This is the potential 'swing factor' between Perspective 2 and Perspective 3.
S1: Plateau

AI assists but doesn't meaningfully accelerate research. Code generation remains useful but doesn't fundamentally change R&D pace. Productivity gains are modest (10-20%).

S2: Continued Pace

AI materially accelerates software engineering — 50-70% productivity gains. Research cycles compress. AI contributes to experiment design and analysis but human researchers still drive direction.

S3: Accelerated

AI contributes substantially to AI research itself. Feedback loops emerge: better AI makes better AI faster. 90%+ code generation, autonomous experiment pipelines, novel research contributions.

Signals to Monitor
SWE-bench Verified Scores
SWE-bench Verified uses a curated subset of GitHub issues that have been human-verified as solvable. This makes it a more reliable measure than the standard version, which may include ambiguous or unsolvable issues. The 79% score represents AI's ability to independently resolve real software engineering problems, the most direct proxy for measuring AI's R&D automation potential. [24][25]
Gold standard for real-world software engineering capability. Tracks quarterly leaderboard updates on human-verified solvable GitHub issues. The most direct measure of AI's ability to automate software development.
Perspective Alignment: ▶ Rapid improvement [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
79.2% (Claude Opus 4.5, high compute configuration, early 2026). The improvement trajectory has been steep: from ~33% in mid-2024 to nearly 80% in early 2026. Multiple models now cluster near the top (Claude, GPT-5, Gemini 3), indicating that this isn't a single-model fluke but a broad capability frontier advance. The 21-point gap to the theoretical 100% ceiling represents the hardest issues — those requiring deep codebase understanding, multi-file coordination, and architectural judgment. Achieving 90%+ would strongly signal Perspective 3, as it would mean AI can handle essentially all standard software engineering tasks independently. Sonar also briefly claimed the top spot, showing strong competition is driving rapid improvement. [24][25][71][72]
Confidence: High — transparent, reproducible benchmark with multiple competing evaluations and clear methodology
Sources: [24] [25] [71] [72]
LiveCodeBench Scores
LiveCodeBench uses fresh competitive programming problems that post-date model training cutoffs, making it very hard to game through memorization. This contamination-resistant design makes it one of the most trustworthy measures of genuine coding capability. Near-ceiling performance here (91.7%) suggests that AI has largely mastered competition-style algorithmic problem-solving. [35][79]
Contamination-resistant benchmark using fresh competitive programming problems. Hard to game through memorization. Tracks genuine algorithmic reasoning and code generation capability.
Perspective Alignment: ▶ Near ceiling [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
91.7% (Gemini 3 Pro) — approaching ceiling performance on competitive programming tasks. This benchmark is particularly significant because it uses problems published after model training cutoffs, ensuring scores reflect genuine reasoning rather than memorization. The score has improved from ~60% in early 2025 to over 91% in early 2026. Combined with HumanEval (96.2%, o1-mini) and MBPP (~90-92%, GPT-5/Claude), the evidence strongly suggests that AI has essentially mastered standard code generation tasks. The remaining challenges are in complex, multi-step engineering projects (measured by SWE-bench) rather than individual coding problems. [34][35][79]
Confidence: High — contamination-free design and regular refresh of problem sets provide high methodological integrity
Sources: [34] [35] [79]
AI Code Contribution Metrics
These metrics measure AI's actual role in production R&D pipelines: not benchmark performance, but real-world impact. Anthropic's self-reported figure of ~90% AI-written code is the most striking data point, though as a self-reported metric from an AI company, it should be interpreted with caution. The Pragmatic Engineer's independent analysis and Fortune's cross-company reporting provide additional validation. The 67% measured productivity increase is a concrete economic signal. [36][37][80][151]
Real-world measure of AI's role in production R&D pipelines. Tracks lab announcements, productivity data, and the percentage of code written by AI across organizations. The bridge between benchmark capability and actual economic impact.
Perspective Alignment: ▶ Strong S3 signal [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
The most striking data point: ~90% of code at Anthropic is now written by AI, with a measured 67% productivity increase (Dario Amodei, CEO). Fortune reports that both Anthropic and OpenAI now have the majority of their code AI-generated. Claude Code (Anthropic's coding agent) has been described as an "inflection point" in how software engineering works — The Pragmatic Engineer's deep-dive analysis confirms that it has fundamentally changed development workflows at AI labs. Beyond leading AI companies, broader industry data shows AI code assistance adoption exceeding 70% among professional developers (GitHub Copilot, Cursor, Claude Code). The trend is clear: AI is transitioning from "assistant" to "primary author" of code, with humans increasingly serving as architects and reviewers rather than writers. [36][37][80][151]
Confidence: High — while self-reported metrics require caution, multiple independent sources (industry surveys, journalism, open-source contribution data) confirm the trend
Sources: [36] [37] [80] [151]
AI-Authored Research
This signal tests whether AI can contribute to scientific discovery, not just software engineering. The distinction matters because engineering automation accelerates existing processes, while research contribution could unlock fundamentally new capabilities. Key developments include Sakana.ai's "AI Scientist" framework and evidence of AI-generated papers at ML workshops. However, concerns about scientific integrity and hallucinated references remain significant. [17][110][113][115]
Tracks whether AI can contribute to scientific discovery and research, not just engineering tasks. Monitors AI-authored publications, "AI Scientist" results, and quality assessments at major conferences.
Perspective Alignment: ▶ Early but growing [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Growing evidence but quality remains mixed. Sakana.ai's "AI Scientist" framework demonstrated that AI can autonomously generate, implement, and write up novel ML research ideas — multiple papers were accepted at NeurIPS 2025 workshops. Terence Tao highlighted AI's contribution to solving Erdős Problem 126, demonstrating potential in mathematics. However, concerns are significant: a Columbia University analysis found widespread hallucinated references in AI-generated research, and SciELO reports growing challenges around scientific integrity. AI research contributions currently sit at "workshop quality" — competent but not matching top human researchers at flagship conferences. The key question is whether this represents a temporary quality ceiling or a stepping stone to deeper capability. Science reported that while AI has "supercharged scientists" in some ways, it may have "shrunk science" by narrowing the scope of exploration. [17][110][113][115][116]
Confidence: Medium — clear evidence of growing capability but quality assessment is subjective and integrity concerns complicate evaluation
Closed-Loop Improvement Evidence
The Swing Factor: This is what could push Perspective 2 into Perspective 3. If AI systems become capable enough to meaningfully accelerate AI research itself, the result is a compounding feedback loop: better AI makes better AI faster. Evidence includes: AI systems used to optimize training procedures, discover more efficient architectures, generate and run experiments, and improve their own code. Dario Amodei (Anthropic) and Demis Hassabis (DeepMind) have both indicated they see early evidence of this dynamic emerging. [6][9][10]
The critical "swing factor": evidence that AI is meaningfully improving AI. Tracks research publications showing AI-driven capability jumps, lab announcements about recursive improvement, and measurable feedback loop acceleration.
Perspective Alignment: ▶ Emerging evidence [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Strong emerging evidence, but not yet self-sustaining. Key data points: (1) AI systems are actively used to optimize training procedures at major labs — automated hyperparameter tuning, data mixture optimization, and architecture search are now standard. (2) AI-assisted code generation for AI infrastructure itself (90% at Anthropic) means the tools are directly accelerating their own development pipeline. (3) Dario Amodei has described a "positive feedback loop" where Claude improvements accelerate Anthropic's development speed, which in turn accelerates next-generation model training. (4) DeepMind's AlphaFold and related systems demonstrate AI contributing to scientific breakthroughs that feed back into AI capability. However, the loop is not yet self-sustaining — human researchers still provide critical direction, novelty, and judgment. The transition from "AI-assisted" to "AI-driven" research acceleration is the key threshold to watch. This is the single most important signal for differentiating Perspective 2 from Perspective 3. [6][7][9][10][25]
Confidence: Medium-High — multiple credible indicators from lab leaders, though quantifying the feedback loop acceleration precisely remains difficult
Sources: [6] [7] [9] [10] [25]
Directional Impact on Probabilities
→ S3: Strong evidence of closed-loop R&D acceleration (AI systems improving AI systems). → S2: AI assists but does not fundamentally accelerate research. → S1: Limited impact on research productivity. Current assessment: This is the key "swing factor" between S2 and S3. Claude Code (90% code at Anthropic, 67% productivity increase) provides concrete evidence that feedback loops are operational, not theoretical.

4 Robotics & Embodied AI

The convergence of AI capabilities with physical systems, primarily humanoid robotics. This driver determines when AI impact extends beyond the digital into the physical economy.
S1: Plateau

Remaining challenges (battery life, dexterity, true autonomy) prove harder than expected. Deployment remains limited to controlled industrial settings. Costs stay above $50K.

S2: Continued Pace

50,000+ humanoid robots deployed annually by 2028. Costs reach $20-30K. Reliable in structured environments. Manufacturing, logistics, warehousing transformed.

S3: Accelerated

100,000+ units annually. Costs below $20K by 2028. Capable in semi-structured environments. Beginning to handle service, healthcare, household tasks.

Signals to Monitor
Commercial Deployment Scale
Commercial deployment volume is the most direct measure of whether humanoid robotics has crossed the viability threshold. China currently dominates with AgiBot (39% global market share), UBTech, Unitree, and XPeng all shipping at scale. The key threshold: when annual deployments exceed 50,000 units, it signals a self-sustaining market (Perspective 2+). Omdia's market analysis ranks AgiBot as the global #1 in shipments. [44][45][46][48][58][59]
Direct measure of economic viability and market readiness. Tracks unit deployments, customer announcements, market share data, and geographic distribution of humanoid robot shipments.
Perspective Alignment: ▶ Rapid scale-up [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Deployment is scaling rapidly, primarily driven by China. AgiBot: 5,100 units ordered with 39% global market share — ranked #1 worldwide by Omdia. The company has made its U.S. debut and opened its GO-1 platform for third-party development. UBTech: Walker S2 mass production initiated with orders exceeding 800 million yuan; production planned to increase 10x in 2026. Unitree: 5,500+ robots shipped in 2025, surpassing U.S. peers. Strategic partnership with Tencent Robotics X Lab. XPeng: Targeting mass production of Iron humanoid robot in 2026, with ambitions for 1 million units by 2030. Total market: estimated 10,000-15,000 humanoid robots shipped globally in 2025, with projections of 50,000+ for 2026-2027. China accounts for approximately 70% of global shipments. Western companies (Figure AI, Tesla Optimus, Boston Dynamics) are in earlier stages but scaling rapidly. [44][45][46][47][48][49][50][58][59][60]
Confidence: High — shipment data is verifiable through multiple independent sources including Omdia market research, press releases, and industry reporting
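The quoted market-share and unit figures can be cross-checked against the total market estimate. This is a rough sanity check under a stated assumption: it treats AgiBot's 5,100 units as the basis of its 39% share, even though the text describes them as units ordered rather than shipped.

```python
AGIBOT_UNITS = 5_100        # from the text (described as units ordered)
AGIBOT_SHARE = 0.39         # global market share, from the text
THRESHOLD_UNITS = 50_000    # annual deployment threshold for Perspective 2+

def implied_total_market(units: float, share: float) -> float:
    """Total market size implied by one vendor's units and market share."""
    return units / share

total = implied_total_market(AGIBOT_UNITS, AGIBOT_SHARE)
growth_needed = THRESHOLD_UNITS / total
print(f"implied 2025 market: about {total:,.0f} units")
print(f"growth needed to reach {THRESHOLD_UNITS:,} units: {growth_needed:.1f}x")
```

The implied total of roughly 13,000 units sits inside the 10,000-15,000 estimate given above, and reaching the 50,000-unit threshold would require the market to grow a little under fourfold.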
Unit Cost Trajectories
Cost is the key determinant of market scale. At $150K+, humanoid robots are limited to high-value industrial uses. At $30K, they become viable for manufacturing and logistics. Below $20K, consumer and small business markets open. The 20-30% annual cost decline trajectory is driven by Chinese manufacturing scale, component commoditization, and design simplification. Morgan Stanley and Goldman Sachs both project substantial market growth, with Goldman Sachs estimating $38B by 2035. [53][54][56][57]
Determines scalability and market accessibility. Tracks production costs, pricing announcements, component commoditization, and the cost-reduction trajectory needed for mass adoption.
Perspective Alignment: ▶ On S2-S3 trajectory [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Costs are declining rapidly along a clear trajectory. 2023: $150K-$500K (research and high-value industrial only). 2026: $30K-$150K depending on capability level — economically viable for manufacturing and logistics. 2027-2028 projected: $20K-$30K for capable industrial models. 2030+ projected: Sub-$20K, potentially approaching consumer accessibility. The 20-30% annual cost decline is driven by: Chinese manufacturing scale, standardization of key components (actuators, sensors, compute modules), and simplified designs optimized for specific use cases rather than general purpose. ROI analysis shows a 2,070% five-year return with approximately 2-month payback period for factory deployment at current prices. Goldman Sachs projects the humanoid robot market at $38B by 2035. Morgan Stanley's "Humanoid 100" report maps the full value chain from components to deployment. Bank of America projects 3 billion humanoid robots in service by 2060 with unit costs approaching $17K. [53][54][56][57][153]
Confidence: High — cost data is verifiable through manufacturer announcements, analyst reports, and component pricing; trajectory is consistent across multiple independent analyses
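The cost brackets, the stated 20-30% annual decline, and the ROI figures can be related with simple compounding arithmetic. The sketch below is a consistency check under simplifying assumptions: price brackets are compared edge to edge (they may not describe like-for-like models), and the payback calculation assumes constant, undiscounted savings with no operating costs.

```python
def annual_decline(price_start: float, price_end: float, years: float) -> float:
    """Constant annual decline rate that takes price_start to price_end."""
    return 1.0 - (price_end / price_start) ** (1.0 / years)

def project_price(price: float, annual_rate: float, years: float) -> float:
    """Price after compounding a constant annual decline."""
    return price * (1.0 - annual_rate) ** years

def implied_payback_months(roi_percent: float, horizon_months: float = 60.0) -> float:
    """Payback period implied by a total ROI over the horizon (constant savings assumed)."""
    return horizon_months / (roi_percent / 100.0 + 1.0)

# Bracket edges quoted in the text: 2023 vs 2026.
low_edge = annual_decline(150_000, 30_000, 3)
high_edge = annual_decline(500_000, 150_000, 3)
print(f"implied annual decline 2023-2026: {high_edge:.0%} to {low_edge:.0%}")

# Projecting the $30K entry price two years forward at 20% and 30% per year.
print(f"2028 entry price: ${project_price(30_000, 0.30, 2):,.0f} to ${project_price(30_000, 0.20, 2):,.0f}")

print(f"payback implied by a 2,070% five-year ROI: {implied_payback_months(2070):.1f} months")
```

On these assumptions the bracket edges imply a steeper historical decline (roughly 33-42% per year) than the 20-30% headline rate, the $30K entry price falls below $20K by 2028, and the 2,070% return corresponds to a payback of just under three months.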
Sim-to-Real Success Rates
Sim-to-real transfer is the process of training robots in simulation and deploying them in the real world without further training. This has historically been a major bottleneck: policies that work perfectly in simulation often fail catastrophically in reality due to the "sim-to-real gap." The fact that Figure AI achieves >99% accuracy at BMW handling 90,000+ unique parts indicates this gap has been largely closed for structured industrial environments. The remaining challenge is unstructured, dynamic environments. [51][52]
Measures robustness of training transfer from simulation to real-world deployment. Tracks deployment case studies, failure rates, and the range of environments where sim-to-real transfer works reliably.
Perspective Alignment: ▶ Effectively solved (structured) [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
For structured industrial environments, sim-to-real transfer is effectively solved. Figure AI at BMW: Handling 90,000+ unique parts with >99% accuracy in the Spartanburg factory. This represents a landmark deployment — the first large-scale integration of humanoid robots in premium automotive manufacturing. CATL deployment: 99% success rate in battery manufacturing lines. These are production environments with real consequences for failure. The key advances that enabled this: massive simulation environments (NVIDIA Isaac Sim, Google DeepMind's simulation stack), improved domain randomization techniques, and foundation models for robotic control. Remaining gap: Unstructured and dynamic environments (homes, hospitals, outdoor) still see significantly lower transfer rates. The progression from structured → semi-structured → unstructured environments maps directly to the S2 → S3 transition. [51][52][68]
Confidence: High — production deployment data from BMW and CATL are verifiable real-world results with clear success metrics
Sources: [51] [52] [68]
Battery & Autonomy Improvements
Battery life is the most significant remaining hardware constraint for humanoid robots. Current 1-4 hour operational times severely limit deployment options: a robot that needs recharging every 2 hours cannot support a full factory shift. Unlike software capabilities (which are improving exponentially), battery technology follows a more linear improvement curve, making this a potential long-term bottleneck for Perspective 3. [65]
The most significant remaining hardware constraint. Tracks operational duration, energy density improvements, charging infrastructure, and the gap between current capability and continuous-operation requirements.
Perspective Alignment: ⏸ Incremental only [scale: S1 Plateau / S2 / S3 Accelerated]
Current Read
Current operational time: 1-4 hours depending on task intensity and model. This is the single biggest hardware constraint. Improvement has been incremental only, at roughly 5-10% per year, following battery chemistry improvement curves rather than the exponential curves seen in AI software capabilities. Key challenge: humanoid robots are power-hungry systems combining locomotion, manipulation, sensing, and computation, and locomotion in particular draws far more power than a stationary computing task. Workarounds are emerging: hot-swappable battery packs, scheduled charging rotations (3 robots covering one 24/7 station), and tethered operation for stationary tasks. Solid-state batteries (expected 2027-2028 for commercial deployment) could provide a step-change improvement of 30-50% in energy density. Until battery life approaches 8+ hours, a single robot cannot cover a full shift, and deployment will depend on battery swaps or robot rotations rather than continuous autonomy. [65]
Confidence: Medium — battery technology trajectory is well-understood but breakthrough potential (solid-state, alternative chemistries) introduces moderate uncertainty
Sources: [65]
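The battery figures in the Current Read can be turned into rough arithmetic. The sketch below assumes compound annual improvement at the stated 5-10% and uses a hypothetical recharge time, which the source does not give. The helper names `years_to_target` and `robots_per_station` are illustrative only.

```python
import math

def years_to_target(current_hours, target_hours, annual_gain):
    """Years of compound improvement needed to reach the target runtime."""
    return math.log(target_hours / current_hours) / math.log(1 + annual_gain)

def robots_per_station(runtime_hours, recharge_hours):
    """Minimum robots so one is always working while the others recharge."""
    cycle = runtime_hours + recharge_hours
    return math.ceil(cycle / runtime_hours)

# Best current case (4 h) to the 8 h threshold at 5% and 10% per year.
slow = years_to_target(4, 8, 0.05)
fast = years_to_target(4, 8, 0.10)
print(f"4h -> 8h takes about {fast:.1f} to {slow:.1f} years at 5-10% per year")

# A 50% solid-state step change (upper end of the cited range) lifts 4 h to 6 h.
after_step = years_to_target(4 * 1.5, 8, 0.10)
print(f"After a 50% step change: about {after_step:.1f} more years at 10% per year")

# ASSUMPTION: 2 h runtime and 4 h recharge; the recharge time is hypothetical.
print("Robots per 24/7 station:", robots_per_station(2, 4))
```

Under these assumptions, incremental gains alone would need roughly 7 to 14 years to double a 4-hour runtime, and the 2 h runtime with 4 h recharge example reproduces the three-robot rotation mentioned in the Current Read.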
Regulatory Framework Development
Regulation can either enable or block deployment at scale. The EU AI Act provides a foundation for AI regulation broadly, but specific robotics safety standards for humanoid robots in shared workspaces are still being developed. China is pursuing a more permissive approach, which is contributing to its deployment lead. ISO safety standards for collaborative robots exist but weren't designed for the complexity of humanoid systems. Expected timeline for comprehensive frameworks: 2027-2029. [67][84]
Regulatory frameworks can enable or block deployment at scale. Tracks ISO safety standards development, national regulations, EU AI Act robotics provisions, and certification infrastructure maturity.
Perspective Alignment: In development (on a scale from S1: Plateau to S3: Accelerated)
Current Read
Regulatory frameworks are in active development but comprehensive standards remain 1-3 years away. EU: The AI Act provides a foundation for risk-based classification, but specific safety standards for humanoid robots in shared human environments are still being drafted. Existing collaborative robot standards (ISO 10218, ISO/TS 15066) were designed for industrial arms, not full humanoid systems. China: Pursuing a more permissive "sandbox" approach, contributing to faster deployment. Government policy explicitly supports humanoid robotics as a strategic industry. U.S.: OSHA and NIST are developing guidelines but no comprehensive federal framework exists yet. The RAI Institute is working on responsible robotics deployment standards. Key concern for Europe: Overly cautious regulation could cede market leadership to China, while inadequate standards could lead to safety incidents that damage public trust. The Hill Dickinson analysis highlights the "new era of risk" that current legal frameworks are unprepared for. Expected timeline for comprehensive frameworks: 2027-2029. [67][84]
Confidence: Medium — regulatory timelines are inherently uncertain and heavily influenced by political dynamics and potential safety incidents
Sources: [67] [84]
Directional Impact on Probabilities
→ S3: Remaining challenges (battery, dexterity) are solved; mass deployment exceeds 50,000 units per year.
→ S2: Continued scaling, cost reduction, and high success rates in production.
→ S1: Major safety incidents or deployment failures slow the trajectory.
Current assessment: Chinese manufacturers at 99%+ success rates, ROI of 1,400–2,070%, and sim-to-real transfer effectively solved for structured production environments. Together these strongly support S2/S3.
How drivers interact

The four drivers are not independent — progress on one can accelerate or enable progress on others. The perspective that materializes depends not just on individual drivers but on whether compounding effects emerge from their interactions.

Architectures → Agentic AI: Better reasoning and planning architectures directly improve agent reliability and task scope.
R&D Automation → Architectures: AI-assisted research accelerates the development of new architectures and training techniques.
Agentic AI → R&D Automation: More capable agents can automate more of the research process, creating recursive improvement loops.
Robotics ↔ R&D Automation: Sim-to-real success generates deployment data that improves AI models; better AI models improve robotic capabilities.

Current observation: Compounding effects are beginning to materialize — R&D automation (Claude Code) is accelerating architecture and agent development; improved architectures enable better agents; better agents contribute to R&D automation. This recursive dynamic, speculative 12 months ago, now appears operational.
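
The interaction list above can be read as a small directed graph. The following is a minimal sketch, assuming each arrow is one edge and the two-way Robotics link is two edges. `DRIVER_LINKS` and `in_feedback_loop` are illustrative names, not part of the framework.

```python
# Each key points to the drivers it accelerates, as listed above.
DRIVER_LINKS = {
    "Architectures": ["Agentic AI"],
    "Agentic AI": ["R&D Automation"],
    "R&D Automation": ["Architectures", "Robotics"],
    "Robotics": ["R&D Automation"],
}

def in_feedback_loop(graph, start):
    """Return True if following the arrows from `start` can lead back to it."""
    stack = list(graph.get(start, []))
    seen = set()
    while stack:
        node = stack.pop()
        if node == start:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return False

for driver in DRIVER_LINKS:
    print(driver, in_feedback_loop(DRIVER_LINKS, driver))
```

With the links as listed, every driver sits on at least one loop. Removing the Agentic AI → R&D Automation edge breaks the three-driver loop through Architectures while leaving the Robotics loop intact, which is one way to see why that link matters for compounding effects.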

Probability Assessment

Consolidating evidence from all four drivers into perspective probability estimates.

S1: Plateau (5%). Capability stagnation. Driven by diffusion of current systems.
S2: Continued Pace (50%). Steady gains. Agentic workflows mature. Most defensible baseline.
S3: Accelerated (45%). Compounding gains. Expert-supported. Feedback loops materializing.

Why S1 is only 5%

All four core drivers show strong, measurable progress. No serious AI researcher predicts stagnation. Current investment levels ($600B+ in 2026) ensure continued advancement. Even efficiency gains alone would produce meaningful capability improvements.

Why S2 is 50%

The most defensible baseline extrapolation from current trends. Assumes continued engineering gains without paradigm shifts. Consistent with historical patterns of steady improvement. Accounts for both technical progress and adoption constraints.

Why S3 is 45%

Supported by expert timelines from lab leaders. Feedback loops between AI and AI research are beginning to materialize. Architectural innovation (nested learning, causal AI) could unlock step-changes. Investment levels suggest conviction in acceleration.

Why S2 and S3 are nearly equal

There is a 95% probability that either S2 or S3 materializes. The key uncertainty is not whether significant progress occurs, but how fast. Decision-makers should prepare for both trajectories simultaneously.
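
The 95% figure follows directly from the three estimates. This short check uses only the percentages stated above; the variable names are illustrative.

```python
PROBABILITIES = {"S1": 5, "S2": 50, "S3": 45}  # percent, from the assessment above

# The three perspectives are treated as exhaustive, so they must sum to 100.
assert sum(PROBABILITIES.values()) == 100

progress = PROBABILITIES["S2"] + PROBABILITIES["S3"]
s3_share = PROBABILITIES["S3"] / progress

print(f"P(S2 or S3) = {progress}%")
print(f"Share of that probability mass on S3 = {s3_share:.0%}")
```

Conditional on significant progress occurring at all, slightly under half of the probability mass sits on the accelerated path, which is why the framework recommends preparing for both.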

Benchmark Observatory

Current state-of-the-art performance across key AI benchmarks, with visual progress indicators.

Breakthrough Triggers

Observable events that would trigger a re-estimation of perspective probabilities.

Trigger | Example Events | Likely Shift
Architecture breakthrough | Nested Learning or Causal AI deployed in production systems | → S3
Agentic reliability leap | SWE-bench, OSWorld, GAIA exceed 90% simultaneously | → S3
R&D automation evidence | Multiple labs report majority of research code AI-written | → S3
Robotics at scale | 50,000+ humanoid robot deployments in commercial settings | → S3
Major safety incident | Significant harm from AI system failure in critical application | → S1
Reliability regression | Persistent, unfixable failure modes across frontier models | → S1
AGI claims | Credible AGI announcement by major lab with demonstration | → S3
Geopolitical disruption | Chip supply conflict, export controls escalation | Context-dependent
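
Two of the triggers in the table have numeric thresholds and can be expressed as simple checks. This is a sketch under stated assumptions: the thresholds come from the table, while the function names and the sample scores are hypothetical placeholders, not measured results.

```python
AGENTIC_BENCHMARKS = ("SWE-bench", "OSWorld", "GAIA")

def agentic_reliability_leap(scores, threshold=90.0):
    """True only if every tracked benchmark exceeds the threshold at the same time."""
    return all(scores.get(name, 0.0) > threshold for name in AGENTIC_BENCHMARKS)

def robotics_at_scale(deployed_units, threshold=50_000):
    """True once commercial humanoid deployments reach the threshold."""
    return deployed_units >= threshold

# HYPOTHETICAL scores for illustration only.
sample = {"SWE-bench": 92.0, "OSWorld": 91.0, "GAIA": 89.0}
print("Agentic reliability leap:", agentic_reliability_leap(sample))
print("Robotics at scale:", robotics_at_scale(12_000))
```

The `all(...)` condition reflects the word "simultaneously" in the table: one benchmark below the threshold is enough to keep the trigger from firing.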
Help us improve this framework

Your expertise matters

This perspective analysis is a living document. We actively seek input from domain experts, researchers, and practitioners to keep it accurate and comprehensive. Your feedback directly shapes future updates.

Breakthrough Triggers
Should we add or remove trigger events? Are thresholds set correctly?
Benchmarks to Track
Are we watching the right benchmarks? Which ones should we add or deprioritize?
Signals to Monitor
What signals are we missing? Which current ones need recalibration?
Share your feedback

Email us at ai-scenarios@appliedai-institute.de or reach out to your appliedAI contact directly.

Perspective Impact Explorer

How AI advancement affects 10 key domains across Europe — from labor markets to local institutions. Each impact category shows opportunities, risks, and concrete descriptions of what life looks like in each perspective.

Impact Severity Matrix

Overview of impact severity across all categories and perspectives. Each category is explored in detail below.

Category | S1: Plateau | S2: Continued Pace | S3: Accelerated
Labor Market & Skills | Moderate | High | Severe
Public Finance & Social Systems | Manageable | Significant | Crisis
Industry & Competitiveness | Moderate | High | Severe
Innovation & Startups | Moderate | High | Severe
Science System | Mixed | High | Severe
Security & Resilience | Elevated | High | Severe
Digital Public Sphere | High | Severe | Catastrophic
Health & Care | Positive | Strongly Positive | Transformative
Education System | Significant | High | Severe
Local Institutions | Moderate | High | Severe
Detailed Impact Explorer

Expand each category to explore key evidence, perspective-specific impacts, and visions for positive outcomes.

Labor Market & Skills

Key Evidence

6-20%: entry-level employment decline in exposed roles [90]
90%: share of code at Anthropic written by AI [151]
67%: measured productivity increase with AI tools [36]
$500K+: new ARR-per-employee benchmark (up from $200K) [103]
20%: software developer job posting decline [87]

Entry-level knowledge workers are already experiencing 6-20% employment decline in exposed roles. Most entry-level analyst positions now require demonstrated AI proficiency. Senior professionals are splitting into two groups: those who effectively leverage AI tools see dramatic productivity gains, while those who resist face growing pressure. The transition is happening faster than institutional reskilling programs can adapt. [90][91]

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Task automation in knowledge work | Continues via adoption of current capabilities | Expands to multi-step workflows; software, analysis, back-office substantially automated | Rapid expansion across most knowledge professions; many roles fundamentally changed
Entry-level displacement | Meaningful (6–20% in exposed roles); firms hire fewer juniors | Intensifies; junior hiring declines across more sectors | Severe; entry pathways into many professions disrupted
Productivity dispersion | Gap widens between AI-adopting firms and laggards | Gap becomes structural competitive disadvantage | Winners pull far ahead; some organizations cannot adapt fast enough
Wage dynamics | Polarization begins; premium for AI-complementary skills | AI-augmented roles command significant premiums; routine knowledge work wages stagnate | Potential wage collapse in automatable roles; concentration of gains
SECOND-ORDER IMPACTS
Career pathway disruption | Junior roles become scarcer; progression models weaken | Traditional career ladders erode; mid-career reskilling becomes essential | Career structures fundamentally disrupted; continuous reskilling required
Talent competition | Intensifies for AI-skilled workers | Becomes acute; AI talent commands extreme premiums | AI expertise becomes dominant hiring criterion across sectors
Regional employment effects | Concentrated in knowledge-work hubs | Spreads to broader geography; SME employment affected | Systemic regional effects; some regions face structural unemployment
Social cohesion | Strains visible in affected cohorts | Strains broaden; public debate intensifies | Risk of social instability if transitions are poorly managed

What This Looks Like

The three descriptions below cover S1: Plateau, S2: Continued Pace, and S3: Accelerated, in that order.
Moderate disruption. Recent graduates find most analyst positions now require AI proficiency. AI handles data gathering and preliminary analysis, transforming entry-level roles from execution-focused to judgment-focused. Job transformation rather than mass displacement characterizes this perspective. Career progression slows for younger workers as mid-level tasks get compressed. Wage stagnation appears in some professions but overall market stability is maintained. Senior professionals who adapt gain meaningful productivity advantages; those who don't face increasing pressure. Reskilling programs exist but are often underfunded and poorly matched to actual employer needs. [90]
Significant restructuring. Entire job categories begin to shrink — accounting, legal research, marketing, software testing. Mid-career professionals face genuine retraining or wage decline pressure. AI coding tools write the majority of production code, with humans shifting to architecture, review, and strategic decisions. Those who successfully transition to AI-augmented roles see productivity and compensation rise significantly. Companies reorganize around smaller human teams augmented by AI agents, with the new benchmark of $500K+ ARR per employee (up from $200K historically). Unemployment insurance systems face elevated claims in certain regions and demographics. The 'few-person unicorn' phenomenon becomes a reference point: companies like Lovable achieve $150M ARR with just 60 employees. [90][103][104]
Severe disruption within 2-3 years — a transformation that typically takes decades. The majority of entry-level and many mid-level cognitive roles become redundant. Unemployment spikes in exposed sectors and demographics. Career trajectories that previously spanned decades are disrupted within years. Society faces a crisis of purpose and identity for millions of workers whose skills become obsolete faster than they can retrain. Agentic AI systems handle not just individual tasks but entire multi-step workflows, replacing team-level functions. The speed of change outpaces all institutional capacity to reskill workers. Proactive policy becomes essential: reformed education, updated social contracts, new approaches to distributing the value AI creates. [90][91][93]

What If Done Right

A more productive, flexible labor market where AI handles routine cognitive work and humans focus on creative, interpersonal, and strategic tasks. Continuous learning platforms enable smooth career transitions. New forms of human-AI collaboration create roles that didn't previously exist. The productivity gains from AI are broadly shared through updated tax frameworks and social safety nets, ensuring that technological progress translates to improved living standards for all, not just those at the top. Europe's strong social partnership traditions become an asset in managing this transition.

Public Finance & Social Systems

Key Evidence

50-70%: share of government revenue from income/payroll tax (vulnerable base) [92]
40-60%: payroll decline in some sectors under S3 [92]
2,070%: 5-year baseline ROI on humanoid robots [57]
4.5%: GDP uplift potential from AI adoption [152]
33%: financial aid fraud detected by AI tools [149]

AI-driven productivity changes create a fiscal paradox: increased economic output alongside reduced labor-based tax receipts, while transition support demand rises simultaneously. Finance ministries are observing modest but measurable shifts in tax revenue composition. The core challenge is that 50-70% of European government revenue comes from income and payroll taxes — precisely the base that AI-driven labor displacement erodes. [92][95]

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Tax base effects | Minor shifts; labor income taxes remain primary | Meaningful erosion of payroll taxes in affected sectors | Potential fundamental restructuring needed
Unemployment insurance claims | Elevated in affected cohorts; manageable | Meaningful increase; system stress in some regions | Severe stress; system may require fundamental redesign
Retraining expenditure needs | Increased but absorbable | Substantial increase; capacity constraints emerge | Massive investment required; exceeds current institutional capacity
Fiscal efficiency gains | Modest improvements from AI in government | Meaningful savings from automation of public administration | Substantial efficiency possible if governance capacity keeps pace
SECOND-ORDER IMPACTS
Public service demand | Increased demand for transition support | Demand rises faster than capacity; backlogs grow | Demand spike across employment, training, social services
Fiscal sustainability | Manageable with adjustments | Requires proactive tax base adaptation | May require fundamental rethinking of social contract
Pension system effects | Minimal | Some pressure from workforce composition shifts | Significant if employment structure changes fundamentally

What This Looks Like

The three descriptions below cover S1: Plateau, S2: Continued Pace, and S3: Accelerated, in that order.
Manageable adjustments. Modest shifts in tax revenue as AI-augmented firms become more productive while some sectors reduce headcount. Unemployment insurance claims tick up in specific regions and demographics but remain manageable within existing frameworks. Government AI initiatives improve tax collection efficiency and fraud detection, partially offsetting revenue pressures. Policy debate focuses on incremental adjustments — modest training program funding, small tax policy shifts. The fiscal impact is a slow erosion rather than a crisis, giving policymakers time to develop responses. [92][95]
Significant pressure mounting. Treasury departments begin serious planning for tax base transitions. Payroll taxes decline measurably in multiple sectors as companies reorganize around smaller, AI-augmented teams. Active discussions emerge about consumption taxes, AI-specific levies, and corporate tax adjustments. Unemployment insurance systems in some regions face genuine stress, prompting emergency funding and eligibility adjustments. Government retraining programs struggle with capacity and relevance. 'Robot taxes' and universal basic income move from academic discussion to serious political consideration. The fiscal equation becomes strained: falling revenue from labor plus rising transition costs, partially offset by AI-driven efficiency gains in public service delivery. [92][95][97]
Fiscal crisis potential. Tax revenues from labor decline substantially — some sectors lose 40-60% of payroll within 2-3 years. Demand for unemployment insurance and transition support surges simultaneously. Public finance requires fundamental restructuring: new revenue models tied to AI-generated value rather than human labor. Concepts like universal basic income become practical necessities rather than theoretical research topics. The silver lining: public services themselves become dramatically more efficient and accessible through AI augmentation, reducing costs while improving quality. The net fiscal impact depends critically on whether governments can capture a fair share of AI-generated productivity gains through updated tax frameworks. Europe's VAT-heavy tax systems may prove more resilient than payroll-dependent ones. [92][93][95]
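
The fiscal exposure described above can be approximated with a simple product. This is a rough sketch, not a fiscal model: it assumes labor-tax receipts fall in proportion to payroll and holds tax rates constant. The 50-70% and 40-60% ranges come from the text, while `EXPOSED` (the share of total payroll sitting in affected sectors) is a hypothetical value chosen for illustration.

```python
def revenue_loss_share(labor_tax_share, exposed_payroll_share, payroll_decline):
    """Fraction of total government revenue lost, holding tax rates constant."""
    return labor_tax_share * exposed_payroll_share * payroll_decline

EXPOSED = 0.20  # ASSUMPTION: hypothetical share of total payroll in affected sectors

low = revenue_loss_share(0.50, EXPOSED, 0.40)   # lower ends of both cited ranges
high = revenue_loss_share(0.70, EXPOSED, 0.60)  # upper ends of both cited ranges

print(f"Revenue loss: {low:.1%} to {high:.1%} of total government revenue")
```

The result scales linearly with the exposed share, so the choice of `EXPOSED` drives the outcome. The sketch also ignores offsetting revenue such as VAT or corporate taxes on AI-driven output.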

What If Done Right

Broad-based prosperity through new tax frameworks that capture AI-generated value regardless of whether it's produced by humans or machines. Efficient public services that deliver more with less. Adaptive social safety nets that support career transitions rather than just cushioning losses. Europe's strong social model, updated for the AI era, becomes a global reference point for managing technological transition equitably.

Industry & Competitiveness

Key Evidence

4.8-5%: EU share of global high-end AI compute [100]
13.48%: EU enterprise AI adoption rate (vs 41% globally) [101]
$1T: market cap swing in Feb 2026 "AI Scare Trade"
$150M ARR: Lovable, a unicorn with 60 employees [104]
$2-5M: AI-native revenue per employee (vs $400-500K traditional) [103]

The February 2026 'AI Scare Trade' triggered sharp stock sell-offs in sectors vulnerable to AI automation — a market signal that investors already see AI disruption as imminent. Meanwhile, the EU's structural disadvantage is stark: only 4.8-5% of global high-end AI compute, 13.48% enterprise adoption rate, and only 4% 'advanced' AI adopters among EU enterprises. The SaaS business model is being fundamentally disrupted: traditional per-seat pricing is transitioning to transaction-based models as AI automates workflows, and natural language interfaces make traditional dashboards obsolete. [100][101]

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Productivity divergence | Gap widens between AI-leaders and laggards | Gap becomes structural competitive disadvantage | Leaders pull decisively ahead; laggards face existential risk
Sector transformation | Uneven; highest in software, professional services | Broad transformation; manufacturing, services significantly affected | Rapid transformation across sectors; traditional competitive advantages erode
SME viability | Challenges in accessing AI capability | Significant barrier; risk of SME consolidation | Existential pressure on SMEs unable to adopt
Supply chain exposure | Dependencies persist; manageable | Dependencies become strategic vulnerability | Critical dependency on non-European AI infrastructure
Business model disruption | Fee-based intermediaries under early pressure | Traditional SaaS and brokerage models face restructuring | Fundamental repricing of labor-intensive service businesses
SECOND-ORDER IMPACTS
European competitiveness | Gradual erosion relative to US, China | Significant competitive gap opens | Risk of structural decline in key sectors
Employment structure | Shifts within sectors; gradual | Significant sectoral employment shifts | Potential rapid restructuring of industrial employment
Investment patterns | AI investment increases; concentration continues | AI becomes dominant investment priority | Massive capital reallocation toward AI-enabled businesses
Market valuations | Volatility in AI-exposed sectors | Structural repricing of intermediary businesses | Winners and losers clearly separated by AI capability

What This Looks Like

The three descriptions below cover S1: Plateau, S2: Continued Pace, and S3: Accelerated, in that order.
Moderate pressure. European manufacturing companies implement AI quality control, achieving measurable improvements in defect rates and throughput. Competitive advantages remain regional; global gaps narrow only slightly. European strengths in specific domains — precision manufacturing, chemicals, automotive engineering — persist and are enhanced by AI integration. Automation pressures affect labor-intensive sectors but are not transformative at the overall economy level. Incumbent firms maintain competitive position through incremental AI adoption, but the gap with U.S. and Chinese AI-native firms widens slowly. [100][101]
Widening competitive gap. Pressure intensifies dramatically as AI-native companies demonstrate fundamentally different economics. Traditional European industrial companies face growing competitive pressure from AI-first organizations operating with 4-10x higher revenue per employee. The 'few-person unicorn' becomes a competitive reference point — Lovable reaching unicorn status in 8 months with 60 employees illustrates the new speed of company building. Digital-native companies with AI-first cultures outpace traditional industrial firms. EU's compute and model dependency becomes a genuine strategic vulnerability. Enterprise consolidation accelerates as smaller firms struggle with AI transition costs. Regional competitive gaps within Europe also widen as some ecosystems adapt faster than others. [100][101][103][104]
Structural disruption within 18-36 months. Industrial structure transforms at unprecedented speed. Some traditional European competitive strengths become obsolete as AI commoditizes domain expertise. New AI-native competitors emerge and scale to significant revenue within months. Incumbent disadvantage becomes structural: legacy systems, organizational inertia, and labor cost structures that cannot compete with AI-augmented lean organizations. Entire sectors restructure — not over decades as in previous industrial transitions, but within 1-3 years. Global competitiveness gaps become chasm-like. Europe's 4.8% share of global AI compute represents a critical strategic vulnerability. [100][101][102]

What If Done Right

European industry leverages AI to enhance its existing strengths: precision manufacturing, sustainability standards, complex engineering, and deep domain expertise. New AI-native companies emerge alongside transformed incumbents, creating a dynamic competitive ecosystem. Europe's regulatory clarity and high trust environments become competitive advantages for AI deployment in sensitive sectors. Strategic investment in EU compute infrastructure and talent closes the capability gap.

Innovation & Startups

Key Evidence

$7M ARR: ArcAds, with 5 employees, in 1 year [108]
8 months: Lovable, from $0 to unicorn [104]
$500K+: new ARR/employee benchmark [103]
4-10x: AI-native vs traditional revenue/employee

The 'few-person unicorn' phenomenon is fundamentally reshaping innovation economics. Historical precedents like Instagram ($1B acquisition with 13 employees) and WhatsApp ($19B with ~50 employees) were exceptional cases; AI threatens to make them the standard. ArcAds scaled to $7M ARR in one year with just 5 employees. Lovable reached unicorn status ($200M Series A) just 8 months after launch. Some companies now operate entire functional areas with AI agents and minimal human oversight, achieving zero-FTE departments. The SaaS benchmark has shifted from $200K ARR per employee to $500K+, and AI-native companies regularly reach $2-5M. [103][104][108]
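
The revenue-per-employee comparison can be checked with simple division. The sketch below uses the ArcAds figures from this paragraph and the Lovable figures ($150M ARR, 60 employees) cited elsewhere in this document; the names in the code are illustrative.

```python
def arr_per_employee(arr_usd, employees):
    """Annual recurring revenue divided by headcount."""
    return arr_usd / employees

COMPANIES = {"ArcAds": (7_000_000, 5), "Lovable": (150_000_000, 60)}
OLD_BENCHMARK = 200_000  # historical SaaS ARR per employee
NEW_BENCHMARK = 500_000  # current benchmark cited above

for name, (arr, staff) in COMPANIES.items():
    per_head = arr_per_employee(arr, staff)
    print(f"{name}: ${per_head:,.0f} per employee, "
          f"{per_head / OLD_BENCHMARK:.1f}x the old benchmark, "
          f"{per_head / NEW_BENCHMARK:.1f}x the new one")
```

ArcAds works out to $1.4M per employee and Lovable to $2.5M, so the first is below and the second is inside the $2-5M range quoted for AI-native companies.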

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Startup economics | Revenue per employee rises; competitive pressure increases | Few-person unicorns become viable; traditional startup model challenged | Company formation radically transforms; single-person billion-dollar companies possible
IP and defensibility | Some compression of competitive moats | Rapid IP depreciation; new models quickly obsolete existing ones | Near-zero defensibility for pure software; value shifts to data and relationships
VC landscape | Increased competition; many can copy successful models | Structural shift: lower capital requirements, higher competition | Traditional VC model challenged; new financing structures emerge
Software economics | Software development costs decline | Software becomes commodity; value shifts to integration | "Throw-away software": users prompt what they need when they need it
SECOND-ORDER IMPACTS
Employment in startups | Growth in AI-augmented roles; some displacement | Dramatically fewer employees needed per unit revenue | Startup employment as historically understood may largely disappear
European startup ecosystem | Pressure to match AI-native efficiency | Historic European capital disadvantage becomes less relevant | European startups can compete globally with minimal capital
Tools vs. platforms | General AI tools from major providers compete with niche startups | Platform economics dominate; niche products struggle | Tools for agents vs. tools for humans becomes key distinction
Wealth concentration | Gains increasingly concentrated among founders | Extreme concentration as few-person companies capture massive value | Unknown; may require policy intervention

What This Looks Like

The three descriptions below cover S1: Plateau, S2: Continued Pace, and S3: Accelerated, in that order.
Moderate changes. A European SaaS startup that previously needed 50 employees for $10M ARR now achieves the same with 20-30 employees. Venture capital dynamics remain largely intact but shift toward valuing AI integration capability. Startup success increasingly depends on AI fluency, but traditional patterns of fundraising, team building, and scaling persist. IP value erodes gradually as AI democratizes certain capabilities. Some new opportunities emerge in AI tooling but the fundamental startup ecosystem structure holds. [103][108]
Rapid transformation. The startup landscape restructures fundamentally. Companies like Cursor and Lovable demonstrate 'AI-native' founding principles — building products at 10x the speed and 1/10th the team size of traditional approaches. Traditional hiring curves compress from 5 years to 18-24 months. VC models shift as capital requirements decrease, lowering barriers to entry but also making differentiation harder. European founders face both opportunity (lower barriers) and threat (global competition intensifies as 'few-person unicorns' become feasible everywhere). Consolidation pressure increases on mid-market companies squeezed between AI-native startups and transformed enterprises. [103][104][105]
Paradigm shift. The concept of 'startup' as historically understood begins to dissolve. Individual technical founders can build billion-dollar-value companies with AI functioning as co-founder, CTO, and engineering team simultaneously. Venture capital models face disruption as the capital required to build significant products drops dramatically. Barriers to entry collapse across industries; competition intensifies to extreme levels. Startup success becomes more about speed of iteration and market capture than about capital raising or team building. IP becomes perishable — a competitive advantage today can be replicated by an AI-augmented competitor within weeks. The entire innovation ecosystem reshapes around AI-native creation, with implications for how Europe funds, supports, and regulates new ventures. [102][103][109]

What If Done Right

Europe becomes a hub for AI-enhanced innovation, leveraging its strong research base, regulatory clarity, and social trust infrastructure to attract AI-native startups. New support mechanisms emerge for solo and small-team founders. The democratization of company-building means more European innovators can compete globally without needing to relocate to Silicon Valley. EU programs evolve from traditional incubators to AI-native accelerators that help founders leverage AI from day one.

Science System

Key Evidence

200 → 2: Aletheia, 200 solutions filtered to 2 genuinely novel [116]
months → minutes: AlphaFold, structural biology transformation [112]
53: NeurIPS 2025 papers with hallucinated citations [113]
5-10x: industry vs academia AI researcher compensation gap

The Erdős Problem case study illustrates AI's current science capability precisely: Google DeepMind's Aletheia generated 200 candidate solutions to 700 open mathematical conjectures. After expert filtering, 63 were correct, 13 were 'meaningfully correct,' and only 2 were genuinely novel — demonstrating the 'O-ring automation' pattern where AI massively speeds up generating candidates but skilled human judgment remains essential for identifying what truly matters. Andrew White (FutureHouse) captures the core limitation: AI currently lacks 'scientific taste' — the expert intuition that identifies which of 10,000 novel discoveries is truly significant. Meanwhile, the brain drain from academia to industry (5-10x compensation gap) is hollowing out academic AI research capacity. [112][113][116]
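
The filtering funnel in the Aletheia example can be expressed as yields relative to the initial candidates. This sketch uses only the counts quoted above; `FUNNEL` and `funnel_yields` are illustrative names.

```python
FUNNEL = [
    ("candidate solutions", 200),
    ("correct", 63),
    ("meaningfully correct", 13),
    ("genuinely novel", 2),
]

def funnel_yields(stages):
    """Share of the initial candidates that survives to each stage."""
    total = stages[0][1]
    return {name: count / total for name, count in stages}

for name, share in funnel_yields(FUNNEL).items():
    print(f"{name}: {share:.1%}")
```

Roughly one candidate in three is correct, but only one in a hundred is genuinely novel, which is the gap that human expert judgment currently has to close.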

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Research acceleration | Meaningful in data-rich domains (biology, materials) | Broad acceleration; AI becomes standard research tool | Transformative; AI contributes to hypothesis generation and experimental design
Scientific integrity | Significant challenges; hallucinations and fraud increase | Crisis mode; verification systems strained | Fundamental challenge to reliability of scientific record
Research concentration | AI-enabled research clusters around established problems | Narrowing intensifies; novel research directions underfunded | Risk of significant narrowing of scientific scope
Talent distribution | Brain drain from academia to industry continues | Accelerates; academic research capacity strained | Critical shortage of academic AI research talent
AI research contribution | AI generates candidates; humans filter and validate | AI begins contributing to experimental design | AI systems contribute substantially to hypothesis generation
SECOND-ORDER IMPACTS
Knowledge reliability | Erosion begins; verification costs rise | Trust in published research weakens | May require fundamental redesign of scientific publishing
Research equity | Gaps widen between well-resourced and other institutions | Concentration of research capability in few institutions | Risk of research becoming viable only in elite settings
Innovation pipeline | Accelerated in AI-enabled domains | Broad acceleration but with integrity concerns | Potentially transformative but dependent on integrity solutions

What This Looks Like

The three descriptions below cover S1: Plateau, S2: Continued Pace, and S3: Accelerated, in that order.
Mixed but manageable impact. AI tools like AlphaFold transform specific research domains — structural biology went from months of work to minutes — without eliminating the need for biological expertise. Drug discovery acceleration is moderate; clinical trials and regulatory requirements still limit overall timelines. New roles emerge around AI oversight and validation. Published research increases in volume, raising quality control questions but within manageable bounds. The integrity challenge (hallucinated citations, synthetic data) is real but addressable with updated review processes. Brain drain from academia to industry continues but stabilizes as institutions adapt compensation and working conditions. [112]
Significant transformation of research practice. AI becomes essential infrastructure — labs without AI capabilities face competitive disadvantage in publication speed and discovery rate. Breakthroughs in drug discovery, materials science, and fundamental physics accelerate significantly. Scientific publishing struggles with the volume of AI-assisted research; peer review systems become genuinely strained. Integrity challenges intensify: hallucinated references (53 found at NeurIPS 2025 alone), synthetic data indistinguishable from real, and difficulty attributing contributions between human and AI. Career trajectories for scientists shift as AI handles routine analysis, experiment design, and literature review. Research inequality grows between AI-equipped institutions and under-resourced ones. [112][113][115]
The most fundamental transformation since the scientific revolution. AI systems conduct the majority of experimental design, data analysis, and hypothesis generation. Human scientists shift to strategic oversight, fundamental question-setting, and validation of AI-generated insights. The discovery rate accelerates dramatically — but with serious authenticity questions. The Aletheia case study scales: AI generates thousands of candidate discoveries, but separating genuine insights from 'mathematically meaningless' results requires ever more sophisticated human judgment. Scientific institutions face an identity crisis: are they research engines or human thought leadership organizations? International scientific cooperation is challenged by competitive AI dynamics. Publication rates become meaningless as a metric; new evaluation frameworks are needed. [113][115][116]

What If Done Right

AI accelerates scientific discovery across disciplines while robust integrity infrastructure ensures trustworthy results. European research institutions leverage AI to punch above their weight globally, using their strong tradition of fundamental research and cross-disciplinary collaboration. New 'AI-native' research methodologies emerge that combine AI's tireless exploration with human scientific taste and judgment. Open science principles ensure AI-generated discoveries benefit all of humanity.

Security & Resilience

Key Evidence

400%: increase in AI-enhanced phishing attacks [117]
80-90%: AI-executed share of the first autonomous cyberattack
26%: vulnerability rate in OpenClaw agent deployments [33]
98%: threat detection rate with AI-powered defense
52%: share of internet content now machine-generated [124]

AI amplifies both offensive and defensive cybersecurity capabilities, but the asymmetry between attack and defense is intensifying. The speed of AI-enabled threats increasingly exceeds human response times. Phishing attacks have become dramatically more convincing with AI-generated personalization, while malware adapts to defenses in real-time. The emergence of agentic AI frameworks like OpenClaw creates entirely new attack surfaces — Cisco's research identifies these system-integrated agents as a "security nightmare" when deployed without proper guardrails. Europol warns that AI-powered threats represent a qualitative shift, not just a quantitative one. [117][118][119][123]

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Cyber offense capability | AI-enhanced phishing and malware; volume increases | Autonomous attack tools become common; attack surface expands | AI agents conducting sophisticated attacks with minimal human oversight
Cyber defense capability | AI improves detection and response; arms race continues | Defense automation advances but may lag offense | Critical test of whether defense can keep pace
Critical infrastructure risk | Elevated; targeted attacks more feasible | Significant; cascading attacks on interconnected systems | Systemic risk; potential for AI-enabled attacks on multiple systems
Fraud and identity crime | Sophisticated deepfake fraud increases | Multi-step, AI-coordinated fraud at scale | AI fraud agents operating autonomously
Agentic system vulnerabilities | Emerging concerns (OpenClaw: 26% skill vulnerability rate) | Enterprise agentic deployments face privilege escalation risks | Agentic AI becomes both attack vector and attack surface
SECOND-ORDER IMPACTS
Insurance and liability | Premiums rise; coverage gaps emerge | Fundamental reassessment of cyber risk models | May exceed insurability in some domains
Trust in digital systems | Strained; verification costs rise | Trust deficit affects digital economy | Potential need for fundamental redesign of digital trust architecture
Geopolitical stability | AI capabilities integrated into state competition | Escalation risks from AI-enabled operations | Significant stability risks from autonomous systems

What This Looks Like

S1: Plateau
Elevated but manageable threat landscape. Cybersecurity teams rely increasingly on AI for threat detection and faster response times. Attackers also use AI: phishing emails become more convincing, and malware adapts more quickly to defenses. A mid-sized company experiences an AI-enhanced ransomware attack where AI maps networks and identifies high-value targets automatically. Critical infrastructure operators implement additional AI monitoring but are constrained by budgets and expertise. The cat-and-mouse dynamic intensifies without fundamental change. Enterprise deployments of agentic frameworks (OpenClaw) create new attack surfaces that security teams must manage, with a 26% vulnerability rate in analyzed deployments. [117][123]
S2: Continued Pace
Fundamental transformation of the threat landscape. Attackers deploy AI agents that autonomously probe systems, adapt to defenses, and coordinate multi-vector attacks. The first major AI-vs-AI cyber conflict occurs between state actors, with the speed of attack and defense exceeding human intervention ability. Critical infrastructure attacks become more frequent and sophisticated; a significant event affecting power grids, water systems, or financial infrastructure causes widespread disruption. Cyber insurance markets convulse as traditional risk models fail to account for AI-amplified threats. Organizations without heavy AI defense investment become dangerously exposed. The security implications of agentic AI with system-level access become a major enterprise concern: prompt injection, privilege escalation, and supply chain attacks on agent tool chains represent entirely new threat categories. [117][120][121][122][123]
S3: Accelerated
Autonomous cyber weapons become the norm. AI systems conduct sophisticated, multi-month campaigns with minimal human oversight: identifying vulnerabilities, developing exploits, establishing persistence, and exfiltrating data or disrupting operations. Attribution becomes extremely difficult; the distinction between state and non-state actors blurs as AI capabilities democratize offensive operations. Critical infrastructure faces existential vulnerability as backup systems can be overwhelmed by coordinated AI-driven attacks. Cyber conflicts escalate geopolitical tensions as the speed of attack outpaces diplomatic response. Defensive moats become temporary: any security advantage is identified and countered within hours rather than months. The entire cybersecurity paradigm shifts from perimeter defense to assume-breach-and-contain models. International norms and treaties on AI weapons become urgent priorities. [117][118][121][122]

What If Done Right

AI-powered defense stays ahead of AI-powered offense through coordinated European cyber defense infrastructure. Shared threat intelligence platforms enable real-time response across member states. European cybersecurity capabilities become a global asset, with the EU's regulatory framework (NIS2 Directive, AI Act) providing a foundation for responsible AI-era security practices. Resilient critical infrastructure withstands AI-era threats through defense-in-depth strategies that assume AI-capable adversaries. International cooperation establishes norms for AI weapons and autonomous cyber operations.

Digital Public Sphere & Democracy

Key Evidence

52%: share of internet content now machine-generated [124]
2x / 6mo: deepfake content doubling rate [125]
3 sec: audio needed for 85% accurate voice cloning [126]
90%: projected share of synthetic content online by 2026 (Europol) [128]
77%: share of Americans concerned about AI misinformation [127]

The digital public sphere is uniquely vulnerable because AI capabilities are already sufficient to cause severe damage: this is the only domain rated "High" even under S1 (Plateau). AI agents can now target and execute actions against specific individuals, companies, and countries, moving beyond passive content generation to active intervention. Voice cloning requires just 3 seconds of audio to achieve 85% accuracy. Citi Institute reports deepfake-related fraud losses exceeding $1B annually. Europol's assessment is stark: the challenge of deepfakes has moved from a technical curiosity to a law enforcement priority. [124][125][126][128]
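
As a rough illustration of what the deepfake doubling rate listed under Key Evidence [125] would imply, the Python sketch below compounds a six-month doubling over the framework's 36-month horizon. It assumes the doubling rate stays constant, which the cited source does not claim; the function name growth_factor is hypothetical and exists only for this example.

```python
def growth_factor(doubling_months: float, horizon_months: float) -> float:
    """Multiplicative growth after horizon_months, assuming volume
    doubles every doubling_months at a constant rate (an assumption)."""
    return 2.0 ** (horizon_months / doubling_months)


if __name__ == "__main__":
    # Six-month doubling taken from the Key Evidence figure; horizons are illustrative.
    for months in (6, 12, 24, 36):
        print(f"{months:>2} months: x{growth_factor(6, months):.0f}")
```

Under this constant-rate assumption, volume would be 64 times its starting level after 36 months, which is why detection capacity that grows linearly falls behind quickly.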

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Synthetic content prevalence | High and growing; detection lags generation | Majority of online content potentially synthetic | Synthetic content indistinguishable from authentic at scale
Manipulation sophistication | Personalized fraud and phishing increase | Mass-scale personalized manipulation becomes feasible | Adaptive, interactive manipulation at population scale
Information verification | Challenging; institutional capacity strained | Crisis in verification; traditional media models stressed | Fundamental challenge to evidence-based discourse
Electoral integrity | Increased risk of interference; manageable with vigilance | Significant stress on electoral information environment | May require fundamental redesign of electoral communication
SECOND-ORDER IMPACTS
Institutional trust | Erosion continues; accelerated by AI-enabled manipulation | Trust crisis in media, government, expertise | Potential legitimacy crisis for democratic institutions
Public discourse quality | Degraded; harder to establish shared facts | Fragmented reality; echo chambers reinforced | May require new models of democratic deliberation
Journalistic viability | Strained; AI both threat and tool | Fundamental business model stress | Journalism as known may require reinvention

What This Looks Like

S1: Plateau
Already high impact even at capability plateau. Deepfakes and synthetic content are already common nuisances. A politician releases a video statement; doctored versions circulate within hours, with detection lagging far behind generation. Voice-cloning scams targeting elderly relatives become routine, with financial losses mounting; McAfee's research shows only 3 seconds of audio is needed for convincing clones. Fact-checkers struggle to keep pace with the volume of synthetic content. News organizations develop AI detection tools but always remain behind generation capability. Public trust in online content erodes steadily; people retreat to personal relationships and familiar sources for reliable information. Elections proceed with heightened vigilance but narrowly avoid catastrophic AI-related incidents. "Seeing is believing" is already obsolete. [124][125][126]
S2: Continued Pace
Information environment reaches tipping point. More online content is AI-generated than human-created. Social media platforms struggle to enforce authenticity policies as synthetic content volume overwhelms moderation systems. A significant election somewhere is materially affected by AI-generated disinformation that spreads faster than any fact-checking infrastructure can counter. Journalists increasingly use AI for content production, raising verification and accountability questions about the news itself. Some societies experiment with "verified zones" of the internet requiring authenticated identity; others resist this as threatening anonymity and free speech. Personalized disinformation becomes industrialized: campaigns can target millions of individuals with customized narratives based on their digital profiles. The concept of a "shared reality" based on common information begins to fracture. [124][125][127][128]
S3: Accelerated
Catastrophic erosion of shared reality. "Authentic" digital content becomes nearly meaningless: any text, image, audio, or video can be generated or manipulated in real time with no detectable artifacts. AI systems conduct coordinated narrative campaigns targeting specific populations, adapting messaging in real time based on response patterns. Elections are materially affected across multiple democracies; public confidence in democratic processes erodes. Authoritarian regimes use AI to control information environments with unprecedented precision; democracies struggle to maintain open public spheres without equivalent tools. Mass-customized manipulation campaigns target entire populations simultaneously with personalized messaging. Public discourse fractures into incompatible realities where different groups operate on fundamentally different sets of "facts." The distinction between journalism and content generation effectively disappears. [124][125][127][128]

What If Done Right

Robust digital identity and content provenance infrastructure restores trust in the digital environment. Standards like C2PA (Coalition for Content Provenance and Authenticity) provide verifiable origin for media content. AI-powered verification tools help citizens navigate the information environment, providing real-time context and credibility signals. Democratic discourse is strengthened through transparency and accountability mechanisms. Europe leads globally in building trustworthy digital infrastructure, making its approach a model for democratic societies worldwide. Media literacy education becomes universal, equipping citizens to critically evaluate information in an AI-saturated environment.

Health & Care

Key Evidence

37%: mortality reduction with AI triage
82%: sepsis detection rate (2x prior methods) [131]
97% / 98%: FDA-cleared AI sensitivity / specificity [132]
70%: reduction in drug discovery timelines [134]
$188B: projected AI healthcare market by 2030 [135]

Healthcare is the most consistently positive impact domain across all perspectives — AI genuinely improves outcomes in every case, though the scale of improvement varies dramatically. The Johns Hopkins AI system detects sepsis with 82% accuracy, roughly doubling prior methods. The FDA-cleared Aidoc CARE foundation model for abdomen CT achieves 97% sensitivity and 98% specificity. Drug discovery timelines are being reduced by up to 70% through AI-accelerated molecule screening and target identification. The market is projected to grow from $14.6B (2024) to $80-188B by 2030-2036, reflecting deep conviction in AI's healthcare potential. [131][132][134][135]
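
To put the 97% sensitivity and 98% specificity figures in context, the Python sketch below computes the share of positive flags that would be true positives (positive predictive value) at different disease prevalence levels. The prevalence values are hypothetical and chosen only for illustration; they do not come from the cited sources, and the function name is likewise an assumption of this example.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Share of positive results that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Sensitivity and specificity as quoted in the text; prevalences are hypothetical.
    for prevalence in (0.01, 0.05, 0.20):
        ppv = positive_predictive_value(0.97, 0.98, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

The point of the sketch is that the same headline accuracy yields very different real-world reliability depending on how common the condition is in the screened population, which matters when comparing deployments across settings.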

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Diagnostic accuracy | Meaningful improvements in imaging, pathology | AI becomes standard diagnostic aid across specialties | AI diagnostics approach or exceed human performance in many areas
Administrative burden | Significant reduction in documentation time | Major transformation of clinical workflows | Administrative roles substantially automated
Drug discovery | Accelerated timelines for specific compounds | Systematic acceleration of pharmaceutical R&D | Potential paradigm shift in drug development speed and cost
Access and equity | Improved access where deployed; gaps persist | Broader deployment; equity depends on distribution choices | Transformative potential for underserved populations, or widening gaps
SECOND-ORDER IMPACTS
Workforce transformation | Augmentation of clinical staff; modest role shifts | Significant workflow redesign; some roles reduced | Fundamental restructuring of healthcare workforce
Liability and governance | Evolving frameworks; some uncertainty | Major governance challenges; liability models tested | Urgent need for new regulatory paradigms
Patient trust | Variable; depends on transparency and outcomes | Trust becomes critical factor in adoption | Public acceptance may lag capability if governance inadequate

What This Looks Like

S1: Plateau
Meaningful positive impact. Radiologists work alongside AI pre-screening imaging for critical findings, catching conditions that might otherwise be missed. Documentation time drops significantly with ambient AI scribes, freeing physicians for patient care. Drug companies accelerate certain discovery phases, though clinical trials and regulatory requirements still limit overall development timelines to years. Patients in well-resourced settings benefit from earlier diagnosis and more accurate treatment recommendations. Under-resourced settings see smaller improvements due to infrastructure gaps. The healthcare workforce remains largely stable, with new roles emerging around AI oversight and maintenance. The key challenge is ensuring equitable access: AI tools could widen the gap between well-resourced and under-resourced healthcare systems if deployment isn't universal. [131][132]
S2: Continued Pace
Strongly positive transformation across the system. AI becomes a standard clinical tool across most medical specialties. Primary care physicians rely on AI-assisted diagnosis that catches conditions they might otherwise miss, effectively giving every GP access to specialist-level diagnostic capability. Specialist expertise is augmented but also partially commoditized as AI handles routine diagnostic work. Pharmaceutical breakthroughs reach clinical trials in compressed timeframes, with AI-accelerated drug discovery reducing timelines by 50-70% for certain drug classes. Healthcare workforce discussions intensify: some roles face displacement (medical coders, routine diagnostic technicians) while others grow (AI-healthcare integrators, patient navigators, AI validation specialists). Critically, rural and underserved areas gain access to specialist-level diagnostic capability through telemedicine combined with AI, potentially reducing healthcare inequality. [131][133][134]
S3: Accelerated
Healthcare transforms more in 3 years than in the previous 30. AI systems achieve diagnostic accuracy exceeding human specialists in most fields, not as a demonstration but as routine clinical practice. Drug development timelines compress to months instead of years for certain drug classes, with AI handling molecular design, toxicity prediction, and trial optimization. The healthcare workforce is fundamentally restructured: most administrative work and routine clinical work are automated. The model of healthcare shifts from reactive (treating disease) to proactive and preventive (continuous monitoring, early intervention, personalized prevention). New challenges emerge: accelerated drug approvals raise safety questions, AI-driven treatment recommendations may create liability issues, and equity challenges intensify as well-resourced systems achieve dramatic improvements while others lag further behind. [131][134][135]

What If Done Right

AI makes high-quality healthcare accessible to all Europeans regardless of geography or wealth. Early detection and personalized treatment plans dramatically improve outcomes for chronic diseases, cancer, and rare conditions. Healthcare workers are freed from administrative burden to focus on patient care and human connection. Drug discovery accelerates to address previously intractable diseases. Europe's universal healthcare systems become the ideal platform for equitable AI deployment, ensuring that technological progress translates to improved health outcomes for everyone, not just those who can afford premium care.

Education System

Key Evidence

62%: test score increase with AI tutoring [130]
86%: K-12 students using AI [140]
88%: UK university students using AI for assessment [138]
71%: teachers burdened verifying work authenticity [139]
80%+: students using AI for writing assignments

Education faces a uniquely urgent challenge: AI is simultaneously the most powerful learning tool ever created and the greatest threat to traditional educational assessment. AI tutoring produces dramatic learning gains (62% test score improvement in controlled studies), yet 80%+ of students already use AI for writing assignments, making traditional evaluation nearly meaningless. AI detection tools remain unreliable and adversarial. The gap between institutions that embrace AI-augmented learning and those that try to ban it is widening rapidly. CDT reports that schools' embrace of AI comes with significant risks to students including privacy, surveillance, and dependency concerns. [136][137][138][139][140]

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Learning enhancement | Meaningful gains for students with access | Substantial personalization possible; adaptive learning standard | Transformative potential for individualized education at scale
Assessment integrity | Significant challenge; current detection insufficient | Crisis mode; fundamental assessment redesign required | Traditional assessment models potentially obsolete
Curriculum relevance | Increasing gap between taught skills and labor market | Gap becomes acute; rapid curriculum adaptation needed | Continuous curriculum evolution required
Teacher role evolution | Augmentation of routine tasks; modest workload shift | Significant role transformation; focus shifts to mentorship | Teachers become orchestrators of AI-enhanced learning
SECOND-ORDER IMPACTS
Inequality effects | Digital divide persists; advantaged students benefit more | Gap widens between AI-equipped and under-resourced institutions | Risk of fundamental educational inequality
Skill development patterns | Some atrophy of foundational skills in AI-dependent students | Broader concerns about critical thinking and resilience | May require fundamental rethinking of what education means
Teacher supply and training | Retraining needs emerge | Significant professional development investment required | Continuous upskilling essential; teacher role redefinition

What This Looks Like

S1: Plateau
Assessment crisis already materializing. A university professor finds 80%+ of students using AI for writing assignments. Detection tools are unreliable; redesigning all assessments is time-consuming and imperfect. Some departments shift to oral examinations or project-based assessment. K-12 schools vary widely: some embrace AI tutoring and see measurable learning improvements; others lack resources and continue traditional approaches. The gap between well-resourced and under-resourced schools widens noticeably. Curriculum discussions intensify about what skills students need when AI handles routine cognitive work, but reforms are slow. The fundamental challenge is manageable but requires active institutional adaptation. [136][138][140]
S2: Continued Pace
Educational assessment enters genuine crisis. Traditional examinations become nearly meaningless as AI produces expert-level work on demand across all subjects. Universities experiment radically: some eliminate written assignments entirely, others develop sophisticated proctored oral and practical examinations. K-12 education splits sharply: leading districts redesign curricula around AI augmentation, teaching students to use AI effectively while developing uniquely human skills (critical thinking, creativity, interpersonal communication). Trailing districts continue traditional approaches, increasingly disconnecting their graduates from labor market reality. Teacher roles transform: the best teachers become AI-augmented learning coaches, while others struggle with obsolescence. Curriculum rethinking becomes urgent: what is education for when AI can perform most routine cognitive tasks? The HEPI/Kortext survey shows 88% of UK students already using AI for assessment: this isn't a future problem, it's today's crisis accelerating. [136][137][138][139][142]
S3: Accelerated
Education systems face existential questions about purpose. If AI can perform most cognitive tasks at expert level, what is education fundamentally for? Assessment methods collapse comprehensively: standardized tests, written examinations, and even many project-based assessments become meaningless measures of human capability. Educational institutions splinter into high-innovation cohorts that reinvent themselves and stagnant ones that become irrelevant. Homeschooling and alternative education models (AI-tutored, project-based, apprenticeship) proliferate. Student anxiety intensifies about the relevance of their education to an AI-transformed labor market. The education-to-employment pipeline breaks down entirely for many career paths. Global education inequity widens dramatically: students with access to AI-augmented learning advance rapidly while those without fall further behind than ever. The fundamental question becomes: education must teach what AI cannot do, but the set of things AI cannot do is shrinking rapidly. [136][137][140][141][142]

What If Done Right

Personalized AI tutoring gives every student access to world-class instruction tailored to their learning style, pace, and interests — the educational equivalent of a private tutor for every child. Assessment shifts from testing memorized knowledge to evaluating creativity, critical thinking, ethical reasoning, and human collaboration skills. Education becomes a lifelong, adaptive system rather than a front-loaded phase that ends at graduation. Teachers evolve into learning coaches, mentors, and facilitators of human development. Europe's tradition of holistic education — emphasizing civic responsibility, cultural understanding, and personal development alongside technical skills — becomes more valuable than ever.

Local Institutions & Liveability

Key Evidence

67%: share of cities integrating AI in some form [146]
15 → 5-7 wks: Burlington permit approval time reduction [144]
6%: share of local governments that prioritize AI [146]
77%: share of citizens who distrust government AI use [147]
30%: San José staff time saved via AI upskilling [145]

Local institutions are where AI's impact is experienced most directly by citizens — through municipal services, permit processing, citizen queries, and community safety. The paradox is striking: while 67% of cities report integrating AI in some form, only 6% of local governments actually prioritize AI strategically. Burlington's AI pilot reduced permit approval from 15 weeks to 5-7 weeks; San José's AI upskilling program saved 30% of staff time. But 77% of citizens distrust government AI use (Gallup/Bentley survey), creating a significant adoption barrier. Meanwhile, AI-generated fraud targeting local governments — including voice impersonation scams against benefit programs — is increasing rapidly. [143][144][145][146][147][148]

Impact Matrix

Impact | S1: Plateau | S2: Continued Pace | S3: Accelerated
FIRST-ORDER IMPACTS
Municipal service efficiency | Meaningful gains in permitting, citizen services | Broad efficiency improvements; workflow transformation | Potentially dramatically streamlined government operations
Citizen access and experience | Improved where deployed; 24/7 availability | Significant enhancement; multilingual, accessible services | Potentially transformative citizen-government interaction
Local fraud and manipulation | Increased AI-enabled fraud; detection improves | Significant fraud pressure on local systems | Severe fraud stress; AI-vs-AI dynamic in detection
Service equity | Digital divide persists; not all residents benefit | Gap widens between AI-ready and under-resourced communities | Risk of fundamental service inequality
SECOND-ORDER IMPACTS
Local government capacity | Strained; AI adoption uneven | Significant capacity gaps in smaller jurisdictions | Institutional overload in communities unable to adapt
Community trust | Variable; depends on transparency and outcomes | Trust becomes critical factor; failures highly visible | Local legitimacy at risk if AI implementation mishandled
Local economic vitality | Divergence between AI-adopting and lagging communities | Divergence intensifies; some communities thrive, others decline | Potential for significant geographic inequality
Social cohesion | Some strain in communities with visible displacement | Strain increases; local support systems tested | Risk of social fragmentation in rapidly changing communities

What This Looks Like

S1: Plateau
Visible inequality in local government capability. A citizen in a forward-thinking city gets a smooth AI-assisted permit application: a chatbot answers accurately, documents are processed in days not weeks, and multilingual access removes language barriers. A neighboring jurisdiction remains unchanged from a decade ago. Some cities use AI for infrastructure maintenance (pothole detection, traffic optimization), while others lack the budget or expertise even to begin. Fraud targeting local governments increases with AI-generated voice impersonation, but detection tools help manage the risk. The early-adopter advantage creates frustration: citizens see what's possible elsewhere and demand equivalent service from their own local government, which may lack capacity to deliver. [144][146][148]
S2: Continued Pace
Dramatic gap between AI-enabled and traditional local government. Leading cities offer a transformed citizen experience: near-instant permitting, proactive service delivery (identifying and fixing issues before citizens report them), 24/7 multilingual assistance, and predictive resource allocation. Lagging jurisdictions see resident frustration intensify as expectations set by leaders aren't met locally. Local government employment shifts as AI handles routine administrative work: processing applications, answering queries, managing records. Some jurisdictions manage transitions smoothly with retraining programs; others face workforce conflicts and service disruptions. Fraud pressure intensifies: sophisticated AI-enabled schemes target benefit programs, tax systems, and emergency services. Communities that invested early in AI capacity thrive; those that didn't struggle to maintain even basic services as their most capable staff leave for AI-enabled jurisdictions or the private sector. [143][144][145][146][148][149]
S3: Accelerated
Local government faces potential institutional crisis. The speed of change exceeds institutional capacity in many jurisdictions. Service delivery bifurcates sharply: AI-enabled cities become model institutions delivering faster, more equitable, more responsive services than ever before. Others face service degradation or collapse as they cannot attract talent, afford systems, or adapt fast enough. The local government employment crisis deepens as transition support proves inadequate for the scale and speed of change. AI-powered fraud becomes an existential threat to some smaller jurisdictions: sophisticated schemes drain benefits, tax revenue, and emergency resources. Social cohesion challenges emerge as digital divides manifest in service quality: citizens in the same region receive dramatically different levels of government service depending on their municipality. Municipal bonds and creditworthiness of less-capable jurisdictions become problematic as fiscal pressures mount from both revenue decline and rising service demands. [143][146][147][148][149]

What If Done Right

AI-enabled local governments deliver faster, more responsive, and more equitable services. Citizens experience AI as a tool that makes their city work better — permits processed in days, infrastructure maintained proactively, services available in any language around the clock. Shared AI platforms (possibly at the state or EU level) ensure that even small municipalities can access modern capabilities without building them alone. Local economies adapt through AI-powered economic development programs that help businesses and workers navigate transitions. Europe's tradition of strong local governance becomes an asset: municipalities close to citizens can adapt AI to local needs more effectively than distant central governments.

Patterns Across Impact Categories

Cross-cutting observations that emerge from analyzing all ten impact domains together.

Adoption capacity is universally important

In all perspectives, the ability to adopt, integrate, and govern AI systems is a primary determinant of outcomes. Higher capacity → better results.

Early-career and vulnerable populations face disproportionate risk

Entry-level workers, digitally excluded populations, and under-resourced institutions bear the earliest and most severe negative impacts.

Trust and integrity are fragile

Across information, science, and governance, AI capabilities stress verification and trust systems. Failures cascade widely.

The gap widens with capability speed

In S3, all impacts intensify — both opportunities and risks. The window for orderly adaptation shrinks dramatically.

Second-order effects are substantial

Direct AI impacts cascade through systems (labor → tax base → public services → local institutions). Planning must account for these chains.

"Done right" visions are achievable

Each impact category has a positive path — but realizing it requires proactive policy, institutional adaptation, and deliberate choices about how AI benefits are distributed.

Time-Critical Impact Patterns

Impacts requiring attention regardless of perspective probability because they are already materializing or have long lead times for response.

Impact Pattern | Why Time-Critical | Perspectives
Entry-level labor displacement | Already visible (6–20% in exposed roles); career effects compound | All (S1–S3)
Software engineering transformation | SWE-bench at human parity; majority of code AI-written at leading firms | All (S1–S3)
Assessment integrity in education | 88% of UK university students using AI; crisis already emerging | All (S1–S3)
Information environment degradation | 52% machine-generated content; detection lagging | All (S1–S3)
SME adoption gap | Structural barrier; widens without intervention | S2, S3
Robotics competitiveness | Chinese manufacturers at 39% global share; costs declining 20–30% annually | S2, S3
Local government capacity | Only 6% prioritize AI; adaptation takes years | S2, S3
Scientific integrity | Hallucinations in peer-reviewed venues; trust erosion underway | All (S1–S3)
Startup economics transformation | Few-person unicorns emerging; traditional models under pressure | S2, S3
Market repricing of AI-exposed sectors | "AI Scare Trade" demonstrates volatility; structural shifts underway | S2, S3

No-Regret Impact Patterns

Impact patterns requiring action regardless of which perspective materializes (probability = 1 for some level of impact):

  • Labor market transition support
  • AI literacy & workforce development
  • Information integrity infrastructure
  • Trustworthy AI governance capacity
  • Digital equity for public services
  • Innovation ecosystem adaptation

These no-regret patterns directly inform the measures identified in the Response Measures section below.

Help us improve impact analysis

Are we capturing the right impacts?

Impact assessment depends on deep domain knowledge. If you work in one of these sectors, your perspective is invaluable for keeping this analysis grounded and comprehensive.

Missing Impacts
Are there first- or second-order effects we are not covering in your domain?
Severity Assessment
Do our severity ratings match your on-the-ground experience? Where are we over- or underestimating?
Cross-Domain Effects
Do you see cascading effects between domains that we have not yet captured?

Email us at ai-scenarios@appliedai-institute.de or reach out to your appliedAI contact directly.

Collection of Response Measures

Measures are derived from the impact analysis and weighted by probability and time criticality. They distinguish no-regret actions (needed regardless of perspective) from perspective-conditional preparations.

All Response Measures
Every measure maps back to an impact (Chapter 3), a probability basis (perspective probabilities from Chapter 2), and time criticality (0–36 months). Measures are scored on impact potential (scale and severity of impact addressed), time criticality (window for effective action), feasibility (implementation within 36 months), and evidence basis (empirical grounding). No-regret measures (probability = 1) are needed regardless of perspective; perspective-conditional measures become critical primarily in S2/S3.
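As an illustration only, the four scoring criteria named above could be combined as in the following Python sketch. The 1 to 5 scale, the unweighted mean, the rule that no-regret measures sort first, and the identifiers EXAMPLE-A to EXAMPLE-C are all assumptions made for this example; the study does not state its scale or weighting, and the scores shown are not taken from it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MeasureScore:
    measure_id: str        # hypothetical identifier, not a measure from the study
    impact_potential: int  # scale and severity of impact addressed (assumed 1-5)
    time_criticality: int  # window for effective action (assumed 1-5)
    feasibility: int       # implementation within 36 months (assumed 1-5)
    evidence_basis: int    # empirical grounding (assumed 1-5)
    no_regret: bool        # True if needed regardless of perspective

    def composite(self) -> float:
        """Unweighted mean of the four criteria (an assumption, not the study's formula)."""
        total = (self.impact_potential + self.time_criticality
                 + self.feasibility + self.evidence_basis)
        return total / 4


def prioritise(measures: List[MeasureScore]) -> List[MeasureScore]:
    """Order no-regret measures first, then by descending composite score."""
    return sorted(measures, key=lambda m: (not m.no_regret, -m.composite()))


# Hypothetical scores, for illustration only.
examples = [
    MeasureScore("EXAMPLE-A", 5, 4, 3, 4, no_regret=True),
    MeasureScore("EXAMPLE-B", 5, 5, 4, 4, no_regret=False),
    MeasureScore("EXAMPLE-C", 3, 3, 4, 4, no_regret=True),
]

for m in prioritise(examples):
    print(m.measure_id, m.composite())
```

A weighted variant would only need a different `composite` method; the ordering rule stays the same.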
appliedAI Institute: Opportunity Fields

Four strategic areas where appliedAI Institute contributes to Europe's AI readiness.

01

Skills & Workforce Transformation

Build measurable AI competence across European professionals and institutions.

Linked measures: NR-13, NR-14

Key outputs: Skills Framework, Academy Programs, Exposure Studies

Target: Corporate professionals, public sector leaders, executives, educators

02

Trustworthy AI Engineering & Adoption

Enable responsible, effective AI deployment across organizations.

Linked measures: NR-2, NR-5, NR-15, NR-19

Key outputs: AI Act Accelerator, Engineering Playbooks, Agent-First Blueprints

Target: AI engineers, product teams, compliance officers, CTOs

03

Local Implementation & Ecosystem Building

Enable AI adoption at the local level, connecting institutions, SMEs, and startups.

Linked measures: NR-8, NR-16, NR-19, NR-20

Key outputs: Municipal Blueprints, SME Programs, Startup-SME Matching

Target: Local governments, SMEs, ecosystem builders, startups

04

Policy Handrails & Decision Support

Provide evidence and structured guidance for AI governance decisions.

Linked measures: NR-1, NR-3, SC-1

Key outputs: AI Perspectives Whitepaper, Policy Briefs, Monitoring Reports

Target: Policy makers, regulators, international bodies

Help us improve response measures

Are these the right measures for Europe?

Response measures must be actionable and grounded in institutional reality. If you work in policy, public administration, industry, or research, we want to hear whether these measures are feasible, complete, and correctly prioritised.

Feasibility Check
Are these measures implementable within existing institutional structures and timelines?
Missing Measures
Are there critical response actions that we have not yet identified?
Ownership & Collaboration
Would your organisation like to co-own or contribute to any of these measures?

Email us at ai-scenarios@appliedai-institute.de or reach out to your appliedAI contact directly.

About This Study

What This Document Provides

  • Not a prediction but a structured framework for preparation across plausible futures
  • Three perspectives differentiated solely by the speed of AI capability progress
  • Evidence-based probability estimates grounded in observable technical drivers
  • Impact analysis across 10 key domains affecting European societies
  • Concrete response measures with priority and timing guidance

Intended Audience

  • Policy leaders: Focus on no-regret measures and institutional capacity building
  • Company leaders: Assess exposure across impact categories and prepare for multiple perspectives
  • Ecosystem partners: Identify collaboration opportunities in measures and opportunity fields

How to Use This Document

  • Policy leaders: Start with the Impact section, then examine no-regret measures. Use probability assessment to calibrate urgency.
  • Company leaders: Assess your organization against each impact category. Identify which perspective would most affect your sector.
  • Ecosystem partners: Review the appliedAI opportunity fields. Identify where your capabilities complement the response measures.
Update System

This document is designed to be a living analysis, continuously updated as the AI landscape evolves.

Annual Edition (Full Revision)

Complete revision of perspectives, probabilities, impacts, and measures. Major structural updates.

Quarterly Review

Update benchmarks, driver assessments, and probability estimates. Adjust measures based on new evidence.

Monthly Monitoring

Scan for breakthrough triggers. Monitor benchmark trajectories. Flag events requiring probability re-estimation.

Research Agent System

AI-assisted continuous monitoring with human oversight. Automated scanning of publications, benchmarks, and industry developments.

Update Triggers & Cadence

Update Type | Cadence | Scope
Annual edition | Every 12 months | Full document revision; all chapters reviewed and updated
Quarterly signal update | Every 3 months | Driver signals, benchmark data, probability re-assessment
Ad-hoc trigger response | Within 2 weeks of trigger | Affected driver(s) and probabilities; communication if shift >5 percentage points

Re-estimation Protocol

When a breakthrough trigger is detected: (1) Evidence review within 2 weeks — compile available evidence on the trigger event. (2) Driver reassessment — update the relevant driver's current read and trajectory. (3) Probability update — revise perspective probabilities based on updated driver assessments. (4) Communication — if probabilities change materially (>5 percentage points), publish an update explaining the change.
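Step (4) of the protocol can be sketched as a simple threshold check. The Python below is a minimal illustration, assuming that "materially" means any single perspective moving by strictly more than 5 percentage points; the function names are hypothetical, the current values are the perspective probabilities published in this document, and the revised values are invented for the example.

```python
from typing import Dict

# Threshold from the protocol text: a change is material above 5 percentage points.
MATERIAL_SHIFT_PP = 5.0


def max_shift_pp(old: Dict[str, float], new: Dict[str, float]) -> float:
    """Largest absolute change, in percentage points, across perspectives."""
    if old.keys() != new.keys():
        raise ValueError("old and new must cover the same perspectives")
    return max(abs(new[k] - old[k]) for k in old)


def needs_communication(old: Dict[str, float], new: Dict[str, float],
                        threshold: float = MATERIAL_SHIFT_PP) -> bool:
    """True if any perspective moved by more than the threshold (assumed reading)."""
    return max_shift_pp(old, new) > threshold


current = {"S1": 5.0, "S2": 50.0, "S3": 45.0}   # probabilities stated in this document
revised = {"S1": 4.0, "S2": 44.0, "S3": 52.0}   # hypothetical re-estimate

print(max_shift_pp(current, revised))         # 7.0
print(needs_communication(current, revised))  # True
```

Under this reading, a shift of exactly 5 percentage points would not trigger a published update.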

Signal Processing Pipeline

Raw monitoring data is processed through structured analysis: Relevance filtering (does this affect a driver or baseline condition?) → Magnitude assessment (incremental progress or potential step-change?) → Verification (corroborated by multiple independent sources?) → Implication mapping (which perspectives, impacts, or measures are affected?) → Update recommendation (does this warrant document revision?).
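The five stages above can be read as a sequential filter. The following Python sketch is one possible arrangement, not the Institute's implementation: the `Signal` and `Assessment` types, the two-source verification threshold, and the rule that only verified step-changes warrant a revision are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Signal:
    title: str
    affected_drivers: List[str]   # empty if no driver or baseline condition is affected
    step_change: bool             # True = potential step-change, False = incremental
    independent_sources: int      # count of corroborating independent sources
    affected_items: List[str] = field(default_factory=list)  # perspectives, impacts, measures


@dataclass
class Assessment:
    title: str
    magnitude: str
    verified: bool
    implications: List[str]
    recommend_revision: bool


def process(signal: Signal, min_sources: int = 2) -> Optional[Assessment]:
    # 1. Relevance filtering: drop signals that touch no driver or baseline condition.
    if not signal.affected_drivers:
        return None
    # 2. Magnitude assessment: incremental progress or potential step-change.
    magnitude = "step-change" if signal.step_change else "incremental"
    # 3. Verification: assumed threshold of at least two independent sources.
    verified = signal.independent_sources >= min_sources
    # 4. Implication mapping: de-duplicated list of affected items.
    implications = sorted(set(signal.affected_items))
    # 5. Update recommendation: assumed rule, only verified step-changes qualify.
    recommend = verified and signal.step_change
    return Assessment(signal.title, magnitude, verified, implications, recommend)


# Hypothetical signal, for illustration only.
result = process(Signal("hypothetical benchmark jump", ["hypothetical-driver"],
                        step_change=True, independent_sources=3,
                        affected_items=["S3", "S3", "S2"]))
print(result)
```

Signals that pass relevance filtering but fail verification are still returned, so a reviewer can see them without a revision being recommended.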

Versioning

Current: v1.0 (March 2026)

Format: X.0 = major annual edition | X.Y = quarterly update | X.Y.Z = minor correction
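The version format can be checked mechanically. The Python sketch below classifies a version string according to the scheme above; the function name `classify_version`, the optional leading "v", and the treatment of a trailing ".0" correction number as absent are assumptions for this example.

```python
import re

# Accepts "1.0", "v1.2", or "1.2.1"; the optional "v" prefix is an assumption.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?$")


def classify_version(version: str) -> str:
    """Map a version string to the update type defined by the X.Y.Z scheme."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"not a valid version string: {version!r}")
    _major, minor, correction = match.groups()
    if correction is not None and int(correction) > 0:
        return "minor correction"
    if int(minor) == 0:
        return "major annual edition"
    return "quarterly update"


print(classify_version("v1.0"))   # major annual edition
print(classify_version("1.2"))    # quarterly update
print(classify_version("1.2.1"))  # minor correction
```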

What appliedAI Institute Can and Cannot Do

To be transparent about scope: appliedAI Institute cannot directly change political frameworks, provide fiscal resources at government scale, build physical infrastructure, regulate or enforce compliance, or make binding policy decisions. However, we can provide evidence and analysis to inform decisions, develop methods and playbooks, train professionals across sectors, convene stakeholders, create reference implementations, and build communities of practice.


Help Us Improve This Framework

This is a living document. We welcome input on perspectives, drivers, impacts, measures, and collaboration opportunities.

ai-scenarios@appliedai-institute.de

Sources & References

All sources referenced throughout this analysis. Numbers in square brackets correspond to in-text citations.

[1] Axios — Hyperscaler Spending 2026. axios.com

[2] Deloitte — AI Adoption Challenges & Trends. deloitte.com

[6] Dwarkesh Patel — Dario Amodei Interview. dwarkesh.com

[7] Sam Altman — Reflections. samaltman.com

[9] Towards AI — Demis Hassabis at Davos on AGI. towardsai.net

[10] Medium — Demis Hassabis on 2030 AGI Timeline. medium.com

[14] GoPubby — Yann LeCun on World Models vs LLMs. gopubby.com

[16] ARC Prize — o3 Breakthrough. arcprize.org

[17] Sakana AI — The AI Scientist. sakana.ai

[19] EA Forum — Demis Hassabis on AGI Requirements. effectivealtruism.org

[20] Google Research — Nested Learning Paradigm. research.google

[21] NeurIPS 2025 — Nested Learning Poster. neurips.cc

[22] Microsoft — INAIT Collaboration. microsoft.com

[23] S&P Global — Causal AI Report. spglobal.com

[24] SWE-bench — Official Leaderboard. swebench.com

[25] Anthropic — Claude Opus 4.5 Announcement. anthropic.com

[26] Princeton — GAIA Benchmark Leaderboard. princeton.edu

[28] OSWorld — Benchmark Leaderboard. os-world.github.io

[29] METR — Measuring AI Long Task Ability. metr.org

[32] MindStudio — What is OpenClaw. mindstudio.ai

[33] Acronis — OpenClaw Security Analysis. acronis.com

[34] arXiv — LLM Code Generation Evaluation. arxiv.org

[35] PricePerToken — LiveCodeBench Leaderboard. pricepertoken.com

[36] Pragmatic Engineer — How Claude Code is Built. pragmaticengineer.com

[37] Reddit r/ClaudeAI — Dario Amodei Quote. reddit.com

[40] Reddit r/ClaudeAI — Multi-Agent Orchestration. reddit.com

[44] AgiBot — Official Press Release. agibot.com

[45] The Robot Report — AgiBot U.S. Debut. therobotreport.com

[46] UBTech — Walker S2 Mass Production. prnewswire.com

[47] Tech in Asia — UBTech Production Scale-up. techinasia.com

[48] South China Morning Post — Unitree Shipments. scmp.com

[49] XPeng — AI Day 2025 Press Release. xpeng.com

[50] Humanoids Daily — XPeng Iron Robot. humanoidsdaily.com

[51] Figure AI — Production at BMW. figure.ai

[52] SCMP — CATL Humanoid Robot Deployment. scmp.com

[53] Morgan Stanley — The Humanoid 100 Report. morganstanley.com

[54] Goldman Sachs — Humanoid Robots Analysis. uc.edu

[56] There's A Robot For That — Cost & ROI Breakdown. theresarobotforthat.com

[57] Robozaps — ROI of Humanoid Robots. robozaps.com

[58] Omdia/PR Newswire — AgiBot #1 Worldwide. prnewswire.com

[59] AI Business — First Mass Humanoid Robot Delivery. aibusiness.com

[60] Tech in Asia — Unitree Ships 5,500+ Robots. techinasia.com

[65] Robozaps — Challenges in Humanoid Robotics. robozaps.com

[67] Hill Dickinson — Humanoid Robots and the Law. hilldickinson.com

[68] AgiBot — GO-1 Platform. agibot-world.com

[71] Epoch AI — SWE-bench Bash Benchmarks. epoch.ai

[72] Sonar — SWE-bench Leaderboard Claim. sonarsource.com

[79] LiveCodeBench — Official Leaderboard. livecodebench.github.io

[80] SemiAnalysis — Claude Code Inflection Point. semianalysis.com

[82] The Cube Research — Causal AI 2026. thecuberesearch.com

[84] RAI Institute — Year of Innovation for Robotics. rai-inst.com

[102] TechCrunch — One-Person Unicorn. techcrunch.com

[104] TechCrunch — Lovable Unicorn. techcrunch.com

[105] Accelirate / BCG — Agentic AI for Enterprise. accelirate.com

[110] Science — AI Supercharged Scientists. science.org

[113] Columbia Statistics — ML Research Integrity. columbia.edu

[115] SciELO — Scientific Integrity in AI Age. scielo.org

[116] Terence Tao — Erdős Problem 126. terrytao.wordpress.com

[123] Cisco — OpenClaw Security Nightmare. cisco.com

[87] BLS — Software Developers. bls.gov

[90] Brynjolfsson, Li, Raymond (NBER, 2023). nber.org

[91] SWE-bench official site. swebench.com

[92] IMF Fiscal Affairs. imf.org

[93] Susskind — A World Without Work. oup.com

[95] World Bank — GovTech. worldbank.org

[97] Deloitte. deloitte.com

[100] Stanford AI Index. aiindex.stanford.edu

[101] Stanford AI Index Report. aiindex.stanford.edu

[103] SaaStr. saastr.com

[108] Henry Shi (Substack). substack.com

[109] OSF Digital. osf.digital

[117] KnowBe4. knowbe4.com

[118] Anthropic. anthropic.com

[119] Vectra AI. vectra.ai

[120] The Hacker News. thehackernews.com

[121] MDPI Systems Journal. mdpi.com

[122] VulnCheck. vulncheck.com

[124] Graphite Research. graphite.io

[125] Citi Institute Report. citigroup.com

[126] McAfee Research. mcafee.com

[127] Pew Research Center. pewresearch.org

[128] Europol Innovation Lab. europol.europa.eu

[130] PMC/University of Chicago Study. ncbi.nlm.nih.gov

[131] Johns Hopkins Medicine. hopkinsmedicine.org

[132] Aidoc CARE FDA Clearance. itnonline.com

[133] Becker's Hospital Review. beckershospitalreview.com

[134] Prajna AI Wisdom / Drug Discovery. medium.com

[135] AI Healthcare Market Report. prnewswire.com

[136] Faculty Focus. facultyfocus.com

[137] CDT. cdt.org

[138] HEPI/Kortext Survey. hepi.ac.uk

[139] CDT Polling. cdt.org

[140] EdWeek. edweek.org

[141] RAND Corporation. rand.org

[142] Gerlich (2025), MDPI. mdpi.com

[143] EY/SmartDev. smartdev.com

[144] Ontario Construction News. ontarioconstructionnews.com

[145] San José / GovTech. govtech.com

[146] ICMA Survey. icma.org

[147] Gallup/Bentley. bentley.edu

[148] Africanews. africanews.com

[149] ABC7 / Federal Investigators. abc7.com

[151] Fortune — 100% AI-Written Code. fortune.com

[152] Concord Coalition. concordcoalition.org

[153] Bank of America — 3 Billion Humanoid Robots Study. actuia.com