Beyond the Model: Where AI Scaling Is Really Headed
AI progress is no longer defined by “better models” alone. The blunt reality: the next step-change in capability is constrained less by algorithms and more by **physical infrastructure**—power generation, grid interconnects, transformers, turbine supply chains, chip fabrication, and memory. If those constraints remain, the industry’s centre of gravity shifts: from software to hardware, from data centres to energy systems, and potentially from Earth-based compute to **space-based solar and orbital compute**.
This piece captures my own takeaways: what to build, what to learn, where the bottlenecks are, and where the trends are likely to go.
Useful Tags
AI Infrastructure, Data Centres, Power Systems, Grid Interconnects, Turbines, Transformers, Semiconductors, Memory, Space Compute, Solar, Robotics, Humanoid Robots, Interpretability, AI Safety, Industrial Strategy
Article Content
Why this conversation matters
Industry has spent the last few years optimising prompts, training runs, and product wrappers—important work, but increasingly downstream of the “real” bottlenecks.
Three themes keep repeating:
- Scaling compute is now a power and supply-chain problem.
- Once power is unlocked, chips (especially memory) become the constraint.
- The next competitive advantage is the ability to execute in the physical world—manufacturing, energy, and robotics—at speed.
This is not a comfortable conclusion if you’ve lived in “software land” for most of your career. It’s also not optional.
1) The industry is hitting a hard wall: electricity and delivery speed
The current situation: chip output is rising quickly, but electricity output (outside China) is comparatively flat. Whether the specific numbers are perfect or not, the direction is hard to ignore:
- Training and inference clusters concentrate demand into single sites.
- Even “simple” compute expansion drags a long tail of physical work:
- generation capacity
- transmission and interconnect agreements
- transformers and switchgear
- cooling plant and redundancy
- construction, permitting, and lead times
The uncomfortable part is that none of this behaves like software scale. You can’t ship a new grid substation by pushing to main.
A practical mental model: “nameplate power” is not your real requirement
Naïve multipliers miss the system-level overhead:
- GPUs plus networking, storage, CPUs
- peak cooling loads on the worst day of the year
- redundancy margin for servicing and failures
- non-IT power (pumps, fans, controls)
You quickly end up with a reality where “X GPUs” implies multiplicative overhead that pushes you into new categories of power engineering.
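To make the multiplicative overhead concrete, here is a back-of-envelope calculator. Every number in it (per-GPU draw, server overhead, PUE, redundancy margin) is an illustrative assumption, not a vendor figure; the point is how the factors stack, not the exact values.

```python
# Back-of-envelope site power estimate. All constants are illustrative
# assumptions; swap in your actual hardware and facility figures.

def site_power_mw(
    gpus: int,
    gpu_kw: float = 1.0,           # assumed per-GPU draw incl. HBM (kW)
    server_overhead: float = 1.3,  # CPUs, NICs, storage around each GPU
    pue: float = 1.3,              # cooling + non-IT load, worst day of year
    redundancy: float = 1.2,       # margin for servicing and failures
) -> float:
    """Utility-side MW needed to run `gpus` accelerators."""
    it_kw = gpus * gpu_kw * server_overhead
    return it_kw * pue * redundancy / 1000.0

# 100k GPUs at ~1 kW each is nowhere near 100 MW once overheads stack:
print(round(site_power_mw(100_000), 1))  # → 202.8 (MW)
```

Under these assumptions, "100,000 GPUs" quietly doubles into a ~200 MW interconnect request, which is exactly the category of power engineering the next section is about.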
2) Utilities are not optimised for speed (and that becomes your problem)
A key bottleneck is utility interconnect lead time—studies, approvals, grid reinforcement, and the slow cadence of regulated organisations.
Whether you build “behind the meter” generation or not, you still hit the same physical constraints:
- turbine lead times
- transformer availability
- permitting and siting
- skilled labour constraints
Utilities “impedance match” to public utility commissions. They move at the speed of regulation and risk management.
Mermaid sequence: what scaling a terrestrial cluster actually looks like
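Roughly, the sequence looks like this (steps and timelines are illustrative; they vary widely by jurisdiction and utility):

```mermaid
sequenceDiagram
    participant Op as AI Operator
    participant Ut as Utility
    participant PUC as Regulator (PUC)
    participant Sup as Equipment Suppliers
    Op->>Ut: Interconnect request (100s of MW)
    Ut->>Ut: Feasibility + system impact studies (months)
    Ut->>PUC: Reinforcement and rate approvals
    PUC-->>Ut: Conditional approval (more months)
    Ut->>Sup: Order transformers, switchgear
    Sup-->>Ut: Delivery (multi-year lead times)
    Ut-->>Op: Energisation date, often years out
    Op->>Op: Evaluate behind-the-meter generation instead
```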
3) “Space-based compute” is a provocative solution to a terrestrial bottleneck
An interesting but challenging idea: that space becomes the economically compelling place to run AI once launch costs and manufacturing throughput allow.
The underlying argument is simple:
- In space, solar is “always on” (no clouds, no night cycle, no atmospheric losses).
- You reduce or remove the need for battery buffering.
- You avoid many terrestrial constraints: land, permitting, local grid bottlenecks.
- Scaling becomes a function of:
- payload-to-orbit
- power density per tonne
- thermal management (radiators)
- comms and latency trade-offs
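The scaling function can be sketched numerically. Every constant below is an assumption chosen for illustration (cell efficiency, areal density, launch capacity are not sourced figures); the takeaway is that orbital power scales with payload mass and launch cadence, not land or permits.

```python
# Rough scaling arithmetic for orbital solar power. All constants are
# illustrative assumptions, not sourced engineering figures.

SOLAR_CONSTANT_KW_M2 = 1.361   # above-atmosphere irradiance
PANEL_EFF = 0.30               # assumed cell efficiency
ARRAY_KG_PER_M2 = 2.0          # assumed areal density incl. structure
PAYLOAD_T_PER_LAUNCH = 100     # assumed heavy-lift capacity

def orbital_kw_per_tonne() -> float:
    """Electrical kW generated per tonne of array on orbit."""
    kw_per_m2 = SOLAR_CONSTANT_KW_M2 * PANEL_EFF
    m2_per_tonne = 1000.0 / ARRAY_KG_PER_M2
    return kw_per_m2 * m2_per_tonne

def launches_for_gw(gw: float) -> float:
    """Heavy-lift launches needed to loft `gw` GW of array mass."""
    tonnes = gw * 1_000_000 / orbital_kw_per_tonne()
    return tonnes / PAYLOAD_T_PER_LAUNCH

print(f"{orbital_kw_per_tonne():.0f} kW/t")     # ~204 kW per tonne
print(f"{launches_for_gw(1):.0f} launches/GW")  # ~49 launches for 1 GW
```

Under these (generous) assumptions, a gigawatt of array is tens of launches of structure alone, before radiators, compute, and comms, which is why launch cadence and power density per tonne are the variables that matter.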
Flowchart: how constraints shift over time
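In rough form (stages and ordering are directional, not guaranteed):

```mermaid
flowchart LR
    A["Power and grid (now)"] --> B["Chips and memory (next)"]
    B --> C["Launch cadence / orbital ops (if space compute happens)"]
    C --> D["Industrial capacity (robotics, refining, fab build-out)"]
    D -- robots accelerate the build-out --> A
```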
4) Chips are the next choke point (and memory looks nastier than logic)
Even if you accept that power is the near-term bottleneck, chips are the medium-term constraint, and memory may be the hardest part.
I think this is underappreciated in mainstream AI discourse. We speak about “GPUs”, but a cluster is a full-stack system:
- logic (compute)
- memory (HBM, DDR, persistent)
- packaging and interconnect
- test, yield, and supply chain
xAI’s “TeraFab” ambition is really about this: scaling to the next order of magnitude will require industrial-scale semiconductor output, not just better models.
A useful question I keep coming back to:
If the world can’t build memory fast enough, what does the next generation of AI architectures do differently?
Possible answers include:
- more compute-efficient training
- retrieval and external memory systems
- sparsity, mixture-of-experts, and conditional compute
- better compression and quantisation
- architectural shifts that reduce memory pressure
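A rough memory-footprint sketch shows why the quantisation and mixture-of-experts levers are so powerful. Parameter counts and byte widths below are illustrative, not any specific model's figures.

```python
# Sketch: how quantisation and mixture-of-experts change the memory a
# deployment needs. All model sizes and precisions are illustrative.

def weight_gb(params_b: float, bytes_per_param: float) -> float:
    """Weight memory in GB for a model of `params_b` billion parameters."""
    return params_b * bytes_per_param  # 1e9 params * bytes/param ≈ GB

def moe_active_gb(total_b: float, active_frac: float,
                  bytes_per_param: float) -> float:
    """Per-token working set when only a fraction of experts fire."""
    return weight_gb(total_b * active_frac, bytes_per_param)

dense_70b_fp16 = weight_gb(70, 2.0)           # 140 GB of weights
dense_70b_int4 = weight_gb(70, 0.5)           # 35 GB after 4-bit quantisation
moe_400b_int4 = moe_active_gb(400, 0.1, 0.5)  # 20 GB active per token
```

Under these assumptions, a sparse 400B-parameter model can present a smaller per-token working set than a dense 70B one, which is exactly the kind of design pressure memory scarcity creates.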
But the direction remains: hardware economics shape software design.
5) Digital human emulation looks like the next product inflection
A second major prediction is that digital human emulation—AI that can do anything a competent human can do at a computer—arrives quickly and becomes the major commercial unlock.
I find this plausible, not because it’s “AGI”, but because:
- many valuable workflows are still UI-driven and messy
- the integration barrier (APIs, legacy systems) is real
- replacing the human at the desktop can bypass integration entirely
From a strategy perspective, this is a bridge:
- Digital “remote workers” create immediate economic value.
- Physical robots superset that capability later by moving atoms, not just electrons.
Flowchart: the “digital → physical” capability ladder
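In rough form:

```mermaid
flowchart TD
    A["Chat and copilot products"] --> B["Computer-use agents: digital remote workers"]
    B --> C["Teleoperated / supervised robots"]
    C --> D["Autonomous humanoid robots at scale"]
    B -. moves electrons .-> E["Economic value"]
    D -. moves atoms .-> E
```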
6) Robotics is the real “infinite money glitch” (but only if you close the loop)
Robotics development has three hard problems:
- Real-world intelligence (perception + control)
- The hand (dexterity and degrees of freedom)
- Scale manufacturing (the supply chain doesn’t exist yet)
The key here isn't that robots are hard (everyone knows that). It’s that robots change the industrial ceiling:
- If you can produce robots cheaply and in volume,
- and those robots can help produce more robots and infrastructure,
- you can accelerate the very constraints that currently limit AI (power, refining, fab build-out).
This is the “recursive loop” argument.
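A toy compounding model makes the argument concrete. The fleet size and reinvestment rate are invented for illustration; the point is the shape of the curve, not the numbers.

```python
# Toy model of the recursive loop: robots spend part of their output
# building more robot-manufacturing capacity. All rates are illustrative.

def fleet_over_time(
    years: int,
    initial_fleet: float = 10_000,
    robots_per_robot_year: float = 0.5,  # assumed reinvested output rate
) -> list[float]:
    """Fleet size at the end of each year under simple compounding."""
    fleet = initial_fleet
    out = [fleet]
    for _ in range(years):
        fleet += fleet * robots_per_robot_year  # compounding, not linear
        out.append(fleet)
    return out

# Even a modest reinvestment rate compounds: 10_000 * 1.5**10 ≈ 576,650.
print(fleet_over_time(10)[-1])
```

The same compounding applies to whatever the robots build: power plants, refining capacity, fabs. That is why closing the manufacturing loop, not the robot itself, is the decisive question.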
What I take from it:
- The interesting question isn’t “can we build a robot?”
- It’s “can we build the manufacturing system that builds the robot, and does it scale fast enough to matter?”
7) Alignment: the most practical thread is interpretability and debugging
There is a lot of philosophical discussion about “truth-seeking”, values, and the risks of deception. Some of it is speculative, but one thread is tangible and actionable:
- Build better ways to inspect and debug models.
- Treat failures as “bugs” that need tracing to origin: pre-training, fine-tuning, RL, tool use, or data issues.
- Invest in interpretability tools that can identify:
- deception / reward hacking
- unintended optimisation targets
- brittle or unsafe policies
The “RL against reality” framing is also important: physics is a strict verifier, but humans are not. Models can be correct about the world and still manipulate people.
Mermaid sequence: “debuggable AI” as an engineering loop
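In rough form:

```mermaid
sequenceDiagram
    participant U as User / Monitor
    participant M as Model
    participant T as Interpretability Tooling
    participant E as Engineering Team
    U->>M: Task
    M-->>U: Output (possibly a failure)
    U->>T: Flag failure as a bug
    T->>T: Trace to origin: pre-training, fine-tuning, RL, tools, data
    T-->>E: Localised cause plus evidence
    E->>M: Targeted fix (data patch, reward fix, guardrail)
    E->>U: Regression evals before redeploy
```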
Strategy for future AI development (my synthesis)
A) Treat infrastructure as a first-class product
If you’re serious about scaling, you need an infrastructure strategy that is as deliberate as your model strategy.
That means competence in:
- power engineering and procurement
- cooling, redundancy, and reliability planning
- grid interconnect negotiation and timelines
- equipment supply chain risk (transformers, switchgear, turbines)
- programme execution under permitting constraints
My take: the winners will be those who can execute like industrial firms while iterating like software teams.
B) Assume bottlenecks move; build optionality
The constraint keeps moving:
- Power (now)
- Chips + memory (next)
- Launch cadence / orbital ops (if space compute happens)
- Industrial capacity (robotics and refining)
A sensible strategy is to build optionality at each stage:
- diversify power sources and locations
- secure long-term fab/memory/packaging supply
- invest in model efficiency (to reduce resource intensity)
- develop robotics capability as a long-term constraint breaker
C) Align product direction with “digital human emulation”
If computer-use agents reach reliable capability, the product landscape shifts.
Immediate plays:
- customer service and operations
- back-office workflow execution
- IT / DevOps assistance
- procurement and scheduling
- QA and compliance support (with audit trails)
Higher-difficulty plays:
- CAD/CAE and engineering
- chip design toolchains
- automated experimentation and lab work
D) Make interpretability and auditing non-negotiable
I think this matters for three reasons:
- safety (reduce catastrophic failure modes)
- quality (fewer subtle, high-impact errors)
- governance (credible controls for enterprise and regulators)
If agents are acting in real systems, you need:
- traceability
- policy enforcement
- anomaly detection
- evidence and audit logs
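One concrete shape this can take is a tamper-evident audit trail, where each entry commits to the hash of the previous one so after-the-fact edits are detectable. This is a minimal sketch (the class, field names, and flow are my own illustration; a real deployment would add signing, durable storage, and access control):

```python
# Minimal sketch of a tamper-evident audit trail for agent actions.
# Each entry commits to the previous entry's hash (a hash chain), so
# any edit or deletion breaks verification.

import hashlib
import json

class AuditLog:
    def __init__(self) -> None:
        self.entries: list[dict] = []
        self._last_hash = "genesis"

    def record(self, actor: str, action: str, detail: str) -> dict:
        """Append one agent action, chained to the previous entry."""
        entry = {"actor": actor, "action": action,
                 "detail": detail, "prev": self._last_hash}
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        self._last_hash = entry["hash"]
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; False if any entry was altered."""
        prev = "genesis"
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.record("agent-7", "purchase_order", "PO approved")
log.record("agent-7", "email", "supplier notified")
print(log.verify())  # → True
```

The useful property is not the logging itself but the evidence: when an agent misbehaves in a real system, you can show regulators and customers exactly what it did, in order, with integrity guarantees.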
Critical infrastructure bottlenecks worth learning (and teaching)
If I were advising a team building “serious AI” (beyond demos), these are the areas I’d insist we learn:
- Power systems engineering
- MW/GW scale planning
- redundancy and N+1/N+2 design
- cooling as a power multiplier
- Grid interconnects and regulation
- study timelines
- constraints and curtailment risk
- Transformers and switchgear
- procurement lead times
- installation and commissioning reality
- Generation capacity supply chains
- turbines, blades/vanes, fuel assumptions
- behind-the-meter plant integration
- Semiconductor capacity constraints
- yield curves and ramp timelines
- advanced packaging constraints
- memory supply dynamics
- Operational resilience
- servicing, failures, spares, lifecycle management
- (Optional) Space systems fundamentals
- thermal management and radiators
- radiation effects and fault tolerance
- comms constraints and latency
Where I think trends are heading
If I compress the above into directional bets:
- AI becomes an energy industry (or at least tightly coupled to it).
- Model progress continues, but the differentiator becomes execution in hard constraints.
- Computer-use agents become mainstream and reshape enterprise adoption.
- Robotics becomes the physical extension of AI, accelerating industrial capability.
- Interpretability and governance become necessary for deployment at scale.
- Space-based compute remains speculative, but the underlying driver (regulatory + power constraints) is real.
Insightful Thoughts: Questions this raises
- If electricity is the near-term limiter, what business models change when tokens are power-constrained rather than GPU-constrained?
- If memory is harder than logic, which AI architectures become dominant under memory scarcity?
- If “digital human emulation” becomes viable, how do organisations redesign work rather than just automating tasks?
- If robotics closes the industrial loop, what happens to national competitiveness when labour stops being the limiting factor?
- If interpretability becomes a core engineering discipline, what new roles and toolchains emerge (the “debuggers of minds”)?
- If space compute becomes economically compelling, what are the new failure modes (servicing, comms, orbital debris, geopolitical risk)?
- If bottlenecks keep shifting, what capabilities should we build now to stay adaptable over the next 3–5 years?