Infrastructure Economics
A complete financial breakdown of building and operating a hyperscale AI training facility capable of training frontier models.
A $732 million capital investment broken down into every component.
1,250 nodes × $220k avg. Includes NVLink switch fabric on baseboard. HBM3 memory is supply-constrained.
Dell, Supermicro, HPE markup for assembly, testing, warranty. ~$40k per node.
10× ConnectX-7 (400G) per node. 8 compute + 2 management NICs @ $12k avg.
Dual Xeon/EPYC + 2TB DDR5 + 30TB Gen5 NVMe per node. $30k combined.
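The per-node figures above can be scaled to the 1,250-node fleet with a few lines of arithmetic. The sketch below is illustrative only: the variable names are invented here, and because the text does not say whether "$12k avg" is per NIC or per node, both readings are computed rather than one being assumed.

```python
# Per-node figures quoted in the text above.
NODES = 1_250
PER_NODE_USD = {
    "gpu_baseboard": 220_000,    # "$220k avg", NVLink switch fabric included
    "oem_integration": 40_000,   # "~$40k per node"
    "host_platform": 30_000,     # CPUs + 2TB DDR5 + 30TB NVMe, "$30k combined"
}
NICS_PER_NODE = 10
NIC_FIGURE_USD = 12_000          # "@ $12k avg"; the unit is ambiguous in the text


def fleet_total(per_node_usd: int, nodes: int = NODES) -> int:
    """Scale a per-node cost to the whole fleet."""
    return per_node_usd * nodes


line_items = {name: fleet_total(cost) for name, cost in PER_NODE_USD.items()}

# Two readings of the NIC figure (assumption: the source does not say which).
nic_total_if_per_node = fleet_total(NIC_FIGURE_USD)
nic_total_if_per_nic = fleet_total(NIC_FIGURE_USD * NICS_PER_NODE)

for name, total in line_items.items():
    print(f"{name:16s} ${total / 1e6:,.1f}M")
print(f"nics (per node)  ${nic_total_if_per_node / 1e6:,.1f}M")
print(f"nics (per NIC)   ${nic_total_if_per_nic / 1e6:,.1f}M")
```

The gap between the two NIC readings is an order of magnitude, so the unit of that figure is worth confirming before it is used in a budget.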
AI-optimized construction costs $15-20M per MW, about 70% of silicon cost. The facility is now the bottleneck.
18MW data hall. Reinforced floors (3,000+ lb racks), 40ft ceilings, specialized piping for CDUs.
CDUs, manifolds, cold plates. Captures 70-80% of heat. Enables 40-100kW per rack.
Busways, PDUs, panels. Medium voltage distribution throughout facility.
Cameras, biometrics, man traps. Physical security for billion-dollar assets.
The network is part of the computer. InfiniBand (400 Gb/s per NIC) is mandatory between nodes; the 900 GB/s GPU-to-GPU figure is NVLink bandwidth inside a node.
600-800 QM9700 switches @ $35k avg. 3-tier fat-tree topology for full bisection bandwidth.
20,000+ links. OSFP transceivers ($500-1000 each), AOC/DAC cables. Often underestimated.
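The switch and link counts can be sanity-checked with a rough sizing rule for a non-blocking 3-tier fat-tree. This is a simplified sketch, not the designer's actual method: it assumes 10,000 endpoints (1,250 nodes × 8 compute NICs), 64-port switches (the QM9700 port count at 400G), half of each leaf and aggregation switch's ports facing down, and it ignores pod boundaries and the separate management network. The function and key names are hypothetical.

```python
import math


def size_fat_tree(endpoints: int, radix: int) -> dict:
    """Approximate switch and link counts for a non-blocking 3-tier fat-tree."""
    down = radix // 2                        # half the ports face down, half face up
    leaf = math.ceil(endpoints / down)       # leaf switches connect the endpoints
    agg = leaf                               # each agg switch takes `down` leaf uplinks
    core = math.ceil(agg * down / radix)     # core switches use every port downward
    return {
        "leaf": leaf,
        "agg": agg,
        "core": core,
        "switches": leaf + agg + core,
        "host_links": endpoints,
        "fabric_links": leaf * down + agg * down,  # leaf-agg plus agg-core
    }


ENDPOINTS = 1_250 * 8        # compute NICs only (figures from the text)
SWITCH_PRICE_USD = 35_000    # "$35k avg" from the text

plan = size_fat_tree(ENDPOINTS, radix=64)
switch_cost = plan["switches"] * SWITCH_PRICE_USD

print(plan)
print(f"switch cost: ${switch_cost / 1e6:.1f}M")
```

Under these assumptions the estimate comes to 783 switches and about 20,000 switch-to-switch links, which falls inside the 600-800 switch range quoted above.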
Power availability—not land—is the constraint. Substation queue times dictate project timelines.
Dedicated 69kV→12.47kV substation. 18-36 month utility queue for >10MW loads. The real project bottleneck.
20 acres of industrial-zoned land in Phoenix @ $244k/acre, near power and fiber routes.
Diverse paths to two carrier hotels. 400G uplinks. Meet-me room build.
Storage is optimized for speed (TB/s), not capacity (PB). If checkpointing is slow, GPUs sit idle.
Lustre/GPFS @ $250k/PB. QLC flash with software-managed endurance. Checkpoint 10TB in ~30 seconds (roughly 333 GB/s of sustained write bandwidth).
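The checkpoint target translates directly into a write-bandwidth requirement and an idle-time cost. The sketch below uses decimal units (1 TB = 1,000 GB) and assumes synchronous checkpoints, during which all GPUs wait. The 30-minute checkpoint interval is a hypothetical value chosen for illustration and does not come from the text.

```python
def required_write_gb_per_s(checkpoint_tb: float, seconds: float) -> float:
    """Sustained write bandwidth needed to land a checkpoint within the window."""
    return checkpoint_tb * 1_000 / seconds


def stall_fraction(checkpoint_s: float, interval_s: float) -> float:
    """Share of wall-clock time GPUs sit idle if checkpoints block training."""
    return checkpoint_s / (interval_s + checkpoint_s)


bandwidth = required_write_gb_per_s(checkpoint_tb=10, seconds=30)

# Hypothetical interval: one checkpoint after every 30 minutes of training.
idle = stall_fraction(checkpoint_s=30, interval_s=30 * 60)

print(f"required write bandwidth: {bandwidth:.0f} GB/s")
print(f"GPU time lost to checkpointing: {idle:.2%}")
```

Swapping in a slower storage tier changes only `checkpoint_s`, which makes the trade-off between storage spend and idle GPU time easy to explore.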
High-IOPS SSD tier for small files. Critical for dataset loading and model weights.
50% deposit ($200M) locked for 6 months during chip lead time. At a 6% cost of capital, that is about $6M in carrying cost.
30-50% deposit required on hardware orders. Significant working capital drag.
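The carrying cost of a locked deposit is simple-interest arithmetic. The sketch below reproduces the $200M, 6-month, 6% example and, as an illustration, the same order at the 30% end of the deposit range. The $400M order value is inferred from "50% deposit ($200M)" and the function name is hypothetical.

```python
def deposit_carrying_cost(deposit_usd: float, annual_rate: float, months: float) -> float:
    """Simple-interest opportunity cost of cash locked up in a deposit."""
    return deposit_usd * annual_rate * months / 12


ORDER_VALUE_USD = 400_000_000   # inferred: a 50% deposit of $200M implies a $400M order

cost_at_50pct = deposit_carrying_cost(0.50 * ORDER_VALUE_USD, annual_rate=0.06, months=6)
cost_at_30pct = deposit_carrying_cost(0.30 * ORDER_VALUE_USD, annual_rate=0.06, months=6)

print(f"50% deposit: ${cost_at_50pct / 1e6:.1f}M")
print(f"30% deposit: ${cost_at_30pct / 1e6:.1f}M")
```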
3-year amortization (not 5). Blackwell eclipses Hopper in 2025-26, crushing resale value.
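The effect of the shorter schedule can be seen with straight-line depreciation. The sketch below applies it to the $275M GPU line (1,250 nodes × $220k) as an illustration. Straight-line depreciation and a zero residual value are assumptions, since the text does not specify the method or the asset base.

```python
def straight_line_annual(cost_usd: float, years: int, residual_usd: float = 0.0) -> float:
    """Annual straight-line depreciation charge."""
    return (cost_usd - residual_usd) / years


GPU_CAPEX_USD = 1_250 * 220_000   # figures from the text

annual_3yr = straight_line_annual(GPU_CAPEX_USD, years=3)
annual_5yr = straight_line_annual(GPU_CAPEX_USD, years=5)

print(f"3-year schedule: ${annual_3yr / 1e6:.1f}M per year")
print(f"5-year schedule: ${annual_5yr / 1e6:.1f}M per year")
print(f"extra annual charge: ${(annual_3yr - annual_5yr) / 1e6:.1f}M")
```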
10× Cat 3516 diesel generators (2MW each) @ $400k. N+1 redundancy required.
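The generator count follows from the N+1 rule applied to the 18MW data hall: enough units to carry the load, plus one spare. The sketch below assumes the $400k figure is a per-unit price, which the text does not state explicitly. The function name is hypothetical.

```python
import math


def generators_needed(load_mw: float, unit_mw: float, spares: int = 1) -> int:
    """Units required to carry the load (N) plus redundant spares (N+1 by default)."""
    return math.ceil(load_mw / unit_mw) + spares


UNIT_PRICE_USD = 400_000   # assumption: "$400k" is read as a per-generator price

units = generators_needed(load_mw=18, unit_mw=2)
total_cost = units * UNIT_PRICE_USD

print(f"{units} generators, ${total_cost / 1e6:.1f}M")
```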
The recurring burn rate. Note: Software licensing runs about 3.5× energy costs, the largest OPEX surprise.