Beyond the Model: Where AI Scaling Is Really Headed
AI progress is no longer defined by “better models” alone. The blunt reality: the next step-change in capability is constrained less by algorithms and more by **physical infrastructure**—power generation, grid interconnects, transformers, turbine supply chains, chip fabrication, and memory. If those constraints remain, the industry’s centre of gravity shifts: from software to hardware, from data centres to energy systems, and potentially from Earth-based compute to **space-based solar and orbital compute**.
This piece captures my own takeaways: what to build, what to learn, where the bottlenecks are, and where the trends are likely to go.
Useful Tags
AI Infrastructure, Data Centres, Power Systems, Grid Interconnects, Turbines, Transformers, Semiconductors, Memory, Space Compute, Solar, Robotics, Humanoid Robots, Interpretability, AI Safety, Industrial Strategy
Article Content
Why this conversation matters
Industry has spent the last few years optimising prompts, training runs, and product wrappers—important work, but increasingly downstream of the “real” bottlenecks.
Three themes keep repeating:
- Scaling compute is now a power and supply-chain problem.
- Once power is unlocked, chips (especially memory) become the constraint.
- The next competitive advantage is the ability to execute in the physical world—manufacturing, energy, and robotics—at speed.
This is not a comfortable conclusion if you’ve lived in “software land” for most of your career. It’s also not optional.
1) The industry is hitting a hard wall: electricity and delivery speed
The current situation: chip output is rising quickly, but electricity output (outside China) is comparatively flat. Whether the specific numbers are perfect or not, the direction is hard to ignore:
- Training and inference clusters concentrate demand into single sites.
- Even “simple” compute expansion drags a long tail of physical work:
- generation capacity
- transmission and interconnect agreements
- transformers and switchgear
- cooling plant and redundancy
- construction, permitting, and lead times
The uncomfortable part is that none of this behaves like software scale. You can’t ship a new grid substation by pushing to main.
A practical mental model: “nameplate power” is not your real requirement
Naïve multipliers miss the system-level overhead:
- GPUs plus networking, storage, CPUs
- peak cooling loads on the worst day of the year
- redundancy margin for servicing and failures
- non-IT power (pumps, fans, controls)
You quickly end up with a reality where “X GPUs” implies multiplicative overhead that pushes you into new categories of power engineering.
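To make the multiplicative overhead concrete, here is a back-of-envelope calculator. Every number in it (per-GPU draw, server overhead, PUE, redundancy margin) is an illustrative assumption, not a vendor figure; the point is how the factors stack, not the exact values.

```python
# Back-of-envelope site power estimate. All constants are illustrative
# assumptions; swap in your actual hardware and facility figures.

def site_power_mw(
    gpus: int,
    gpu_kw: float = 1.0,           # assumed per-GPU draw incl. HBM (kW)
    server_overhead: float = 1.3,  # CPUs, NICs, storage around each GPU
    pue: float = 1.3,              # cooling + non-IT load, worst day of year
    redundancy: float = 1.2,       # margin for servicing and failures
) -> float:
    """Utility-side MW needed to run `gpus` accelerators."""
    it_kw = gpus * gpu_kw * server_overhead
    return it_kw * pue * redundancy / 1000.0

# 100k GPUs at ~1 kW each is nowhere near 100 MW once overheads stack:
print(round(site_power_mw(100_000), 1))  # → 202.8 (MW)
```

Under these assumptions, "100,000 GPUs" quietly doubles into a ~200 MW interconnect request, which is exactly the category of power engineering the next section is about.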
2) Utilities are not optimised for speed (and that becomes your problem)
A key bottleneck is utility interconnect lead time—studies, approvals, grid reinforcement, and the slow cadence of regulated organisations.
Whether you build “behind the meter” generation or not, you still hit the same physical constraints:
- turbine lead times
- transformer availability
- permitting and siting
- skilled labour constraints
Utilities “impedance match” to public utility commissions. They move at the speed of regulation and risk management.
Mermaid sequence: what scaling a terrestrial cluster actually looks like
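Roughly, the sequence looks like this (steps and timelines are illustrative; they vary widely by jurisdiction and utility):

```mermaid
sequenceDiagram
    participant Op as AI Operator
    participant Ut as Utility
    participant PUC as Regulator (PUC)
    participant Sup as Equipment Suppliers
    Op->>Ut: Interconnect request (100s of MW)
    Ut->>Ut: Feasibility + system impact studies (months)
    Ut->>PUC: Reinforcement and rate approvals
    PUC-->>Ut: Conditional approval (more months)
    Ut->>Sup: Order transformers, switchgear
    Sup-->>Ut: Delivery (multi-year lead times)
    Ut-->>Op: Energisation date, often years out
    Op->>Op: Evaluate behind-the-meter generation instead
```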
3) “Space-based compute” is a provocative solution to a terrestrial bottleneck
An interesting but challenging idea: that space becomes the economically compelling place to run AI once launch costs and manufacturing throughput allow.
The underlying argument is simple:
- In space, solar is “always on” (no clouds, no night cycle, no atmospheric losses).
- You reduce or remove the need for battery buffering.
- You avoid many terrestrial constraints: land, permitting, local grid bottlenecks.
- Scaling becomes a function of:
- payload-to-orbit
- power density per tonne
- thermal management (radiators)
- comms and latency trade-offs
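The scaling function can be sketched numerically. Every constant below is an assumption chosen for illustration (cell efficiency, areal density, launch capacity are not sourced figures); the takeaway is that orbital power scales with payload mass and launch cadence, not land or permits.

```python
# Rough scaling arithmetic for orbital solar power. All constants are
# illustrative assumptions, not sourced engineering figures.

SOLAR_CONSTANT_KW_M2 = 1.361   # above-atmosphere irradiance
PANEL_EFF = 0.30               # assumed cell efficiency
ARRAY_KG_PER_M2 = 2.0          # assumed areal density incl. structure
PAYLOAD_T_PER_LAUNCH = 100     # assumed heavy-lift capacity

def orbital_kw_per_tonne() -> float:
    """Electrical kW generated per tonne of array on orbit."""
    kw_per_m2 = SOLAR_CONSTANT_KW_M2 * PANEL_EFF
    m2_per_tonne = 1000.0 / ARRAY_KG_PER_M2
    return kw_per_m2 * m2_per_tonne

def launches_for_gw(gw: float) -> float:
    """Heavy-lift launches needed to loft `gw` GW of array mass."""
    tonnes = gw * 1_000_000 / orbital_kw_per_tonne()
    return tonnes / PAYLOAD_T_PER_LAUNCH

print(f"{orbital_kw_per_tonne():.0f} kW/t")     # ~204 kW per tonne
print(f"{launches_for_gw(1):.0f} launches/GW")  # ~49 launches for 1 GW
```

Under these (generous) assumptions, a gigawatt of array is tens of launches of structure alone, before radiators, compute, and comms, which is why launch cadence and power density per tonne are the variables that matter.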
Flowchart: how constraints shift over time
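In rough form (stages and ordering are directional, not guaranteed):

```mermaid
flowchart LR
    A["Power and grid (now)"] --> B["Chips and memory (next)"]
    B --> C["Launch cadence / orbital ops (if space compute happens)"]
    C --> D["Industrial capacity (robotics, refining, fab build-out)"]
    D -- robots accelerate the build-out --> A
```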
4) Chips are the next choke point (and memory looks nastier than logic)
Even if you accept that power is the near-term bottleneck, chips are the medium-term constraint, and memory may be the hardest part.
I think this is underappreciated in mainstream AI discourse. We speak about “GPUs”, but a cluster is a full-stack system:
- logic (compute)
- memory (HBM, DDR, persistent)
- packaging and interconnect
- test, yield, and supply chain
xAI’s “TeraFab” ambition is really about this: scaling to the next order of magnitude will require industrial-scale semiconductor output, not just better models.
A useful question I keep coming back to:
If the world can’t build memory fast enough, what does the next generation of AI architectures do differently?
Possible answers include:
- more compute-efficient training
- retrieval and external memory systems
- sparsity, mixture-of-experts, and conditional compute
- better compression and quantisation
- architectural shifts that reduce memory pressure
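A rough memory-footprint sketch shows why the quantisation and mixture-of-experts levers are so powerful. Parameter counts and byte widths below are illustrative, not any specific model's figures.

```python
# Sketch: how quantisation and mixture-of-experts change the memory a
# deployment needs. All model sizes and precisions are illustrative.

def weight_gb(params_b: float, bytes_per_param: float) -> float:
    """Weight memory in GB for a model of `params_b` billion parameters."""
    return params_b * bytes_per_param  # 1e9 params * bytes/param ≈ GB

def moe_active_gb(total_b: float, active_frac: float,
                  bytes_per_param: float) -> float:
    """Per-token working set when only a fraction of experts fire."""
    return weight_gb(total_b * active_frac, bytes_per_param)

dense_70b_fp16 = weight_gb(70, 2.0)           # 140 GB of weights
dense_70b_int4 = weight_gb(70, 0.5)           # 35 GB after 4-bit quantisation
moe_400b_int4 = moe_active_gb(400, 0.1, 0.5)  # 20 GB active per token
```

Under these assumptions, a sparse 400B-parameter model can present a smaller per-token working set than a dense 70B one, which is exactly the kind of design pressure memory scarcity creates.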
But the direction remains: hardware economics shape software design.
5) Digital human emulation looks like the next product inflection
A second major prediction is that digital human emulation—AI that can do anything a competent human can do at a computer—arrives quickly and becomes the major commercial unlock.
I find this plausible, not because it’s “AGI”, but because:
- many valuable workflows are still UI-driven and messy
- the integration barrier (APIs, legacy systems) is real
- replacing the human at the desktop can bypass integration entirely
From a strategy perspective, this is a bridge:
- Digital “remote workers” create immediate economic value.
- Physical robots superset that capability later by moving atoms, not just electrons.
Flowchart: the “digital → physical” capability ladder
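In rough form:

```mermaid
flowchart TD
    A["Chat and copilot products"] --> B["Computer-use agents: digital remote workers"]
    B --> C["Teleoperated / supervised robots"]
    C --> D["Autonomous humanoid robots at scale"]
    B -. moves electrons .-> E["Economic value"]
    D -. moves atoms .-> E
```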
6) Robotics is the real “infinite money glitch” (but only if you close the loop)
Robotics development has three hard problems:
- Real-world intelligence (perception + control)
- The hand (dexterity and degrees of freedom)
- Scale manufacturing (the supply chain doesn’t exist yet)
The key here isn't that robots are hard (everyone knows that). It’s that robots change the industrial ceiling:
- If you can produce robots cheaply and in volume,
- and those robots can help produce more robots and infrastructure,
- you can accelerate the very constraints that currently limit AI (power, refining, fab build-out).
This is the “recursive loop” argument.
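A toy compounding model makes the argument concrete. The fleet size and reinvestment rate are invented for illustration; the point is the shape of the curve, not the numbers.

```python
# Toy model of the recursive loop: robots spend part of their output
# building more robot-manufacturing capacity. All rates are illustrative.

def fleet_over_time(
    years: int,
    initial_fleet: float = 10_000,
    robots_per_robot_year: float = 0.5,  # assumed reinvested output rate
) -> list[float]:
    """Fleet size at the end of each year under simple compounding."""
    fleet = initial_fleet
    out = [fleet]
    for _ in range(years):
        fleet += fleet * robots_per_robot_year  # compounding, not linear
        out.append(fleet)
    return out

# Even a modest reinvestment rate compounds: 10_000 * 1.5**10 ≈ 576,650.
print(fleet_over_time(10)[-1])
```

The same compounding applies to whatever the robots build: power plants, refining capacity, fabs. That is why closing the manufacturing loop, not the robot itself, is the decisive question.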
What I take from it:
- The interesting question isn’t “can we build a robot?”
- It’s “can we build the manufacturing system that builds the robot, and does it scale fast enough to matter?”
7) Alignment: the most practical thread is interpretability and debugging
There is a lot of philosophical discussion about “truth-seeking”, values, and the risks of deception. Some of it is speculative, but one thread is tangible and actionable:
- Build better ways to inspect and debug models.
- Treat failures as “bugs” that need tracing to origin: pre-training, fine-tuning, RL, tool use, or data issues.
- Invest in interpretability tools that can identify:
- deception / reward hacking
- unintended optimisation targets
- brittle or unsafe policies
The “RL against reality” framing is also important: physics is a strict verifier, but humans are not. Models can be correct about the world and still manipulate people.
Mermaid sequence: “debuggable AI” as an engineering loop
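In rough form:

```mermaid
sequenceDiagram
    participant U as User / Monitor
    participant M as Model
    participant T as Interpretability Tooling
    participant E as Engineering Team
    U->>M: Task
    M-->>U: Output (possibly a failure)
    U->>T: Flag failure as a bug
    T->>T: Trace to origin: pre-training, fine-tuning, RL, tools, data
    T-->>E: Localised cause plus evidence
    E->>M: Targeted fix (data patch, reward fix, guardrail)
    E->>U: Regression evals before redeploy
```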
Strategy for future AI development (my synthesis)
A) Treat infrastructure as a first-class product
If you’re serious about scaling, you need an infrastructure strategy that is as deliberate as your model strategy.
That means competence in:
- power engineering and procurement
- cooling, redundancy, and reliability planning
- grid interconnect negotiation and timelines
- equipment supply chain risk (transformers, switchgear, turbines)
- programme execution under permitting constraints
My take: the winners will be those who can execute like industrial firms while iterating like software teams.
B) Assume bottlenecks move; build optionality
The constraint keeps moving:
- Power (now)
- Chips + memory (next)
- Launch cadence / orbital ops (if space compute happens)
- Industrial capacity (robotics and refining)
A sensible strategy is to build optionality at each stage:
- diversify power sources and locations
- secure long-term fab/memory/packaging supply
- invest in model efficiency (to reduce resource intensity)
- develop robotics capability as a long-term constraint breaker
C) Align product direction with “digital human emulation”
If computer-use agents reach reliable capability, the product landscape shifts.
Immediate plays:
- customer service and operations
- back-office workflow execution
- IT / DevOps assistance
- procurement and scheduling
- QA and compliance support (with audit trails)
Higher-difficulty plays:
- CAD/CAE and engineering
- chip design toolchains
- automated experimentation and lab work
D) Make interpretability and auditing non-negotiable
I think this matters for three reasons:
- safety (reduce catastrophic failure modes)
- quality (fewer subtle, high-impact errors)
- governance (credible controls for enterprise and regulators)
If agents are acting in real systems, you need:
- traceability
- policy enforcement
- anomaly detection
- evidence and audit logs
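One concrete shape this can take is a tamper-evident audit trail, where each entry commits to the hash of the previous one so after-the-fact edits are detectable. This is a minimal sketch (the class, field names, and flow are my own illustration; a real deployment would add signing, durable storage, and access control):

```python
# Minimal sketch of a tamper-evident audit trail for agent actions.
# Each entry commits to the previous entry's hash (a hash chain), so
# any edit or deletion breaks verification.

import hashlib
import json

class AuditLog:
    def __init__(self) -> None:
        self.entries: list[dict] = []
        self._last_hash = "genesis"

    def record(self, actor: str, action: str, detail: str) -> dict:
        """Append one agent action, chained to the previous entry."""
        entry = {"actor": actor, "action": action,
                 "detail": detail, "prev": self._last_hash}
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        self._last_hash = entry["hash"]
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; False if any entry was altered."""
        prev = "genesis"
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.record("agent-7", "purchase_order", "PO approved")
log.record("agent-7", "email", "supplier notified")
print(log.verify())  # → True
```

The useful property is not the logging itself but the evidence: when an agent misbehaves in a real system, you can show regulators and customers exactly what it did, in order, with integrity guarantees.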
Critical infrastructure bottlenecks worth learning (and teaching)
If I were advising a team building “serious AI” (beyond demos), these are the areas I’d insist we learn:
- Power systems engineering
- MW/GW scale planning
- redundancy and N+1/N+2 design
- cooling as a power multiplier
- Grid interconnects and regulation
- study timelines
- constraints and curtailment risk
- Transformers and switchgear
- procurement lead times
- installation and commissioning reality
- Generation capacity supply chains
- turbines, blades/vanes, fuel assumptions
- behind-the-meter plant integration
- Semiconductor capacity constraints
- yield curves and ramp timelines
- advanced packaging constraints
- memory supply dynamics
- Operational resilience
- servicing, failures, spares, lifecycle management
- (Optional) Space systems fundamentals
- thermal management and radiators
- radiation effects and fault tolerance
- comms constraints and latency
Where I think trends are heading
If I compress the above into directional bets:
- AI becomes an energy industry (or at least tightly coupled to it).
- Model progress continues, but the differentiator becomes execution in hard constraints.
- Computer-use agents become mainstream and reshape enterprise adoption.
- Robotics becomes the physical extension of AI, accelerating industrial capability.
- Interpretability and governance become necessary for deployment at scale.
- Space-based compute remains speculative, but the underlying driver (regulatory + power constraints) is real.
Insightful Thoughts: Questions this raises
- If electricity is the near-term limiter, what business models change when tokens are power-constrained rather than GPU-constrained?
- If memory is harder than logic, which AI architectures become dominant under memory scarcity?
- If “digital human emulation” becomes viable, how do organisations redesign work rather than just automating tasks?
- If robotics closes the industrial loop, what happens to national competitiveness when labour stops being the limiting factor?
- If interpretability becomes a core engineering discipline, what new roles and toolchains emerge (the “debuggers of minds”)?
- If space compute becomes economically compelling, what are the new failure modes (servicing, comms, orbital debris, geopolitical risk)?
- If bottlenecks keep shifting, what capabilities should we build now to stay adaptable over the next 3–5 years?