
Roadmap: The AI data center stack

Six areas of infrastructure opportunity, from software and orchestration to grid hardware and cooling technologies.

AI is the most power-intensive workload in computing history. As of early 2026, 190 GW of hyperscale data center capacity has been announced across 777 projects. This includes ~148 GW planned, ~21 GW in construction, and ~12 GW already operational. Global data center electricity consumption is projected to more than double by 2030, and in the U.S., data centers will soon consume more electricity than all energy-intensive manufacturing combined.

Every country innovating with AI faces a new energy crisis where acceleration depends on physical infrastructure buildout. Data centers can be constructed within 12-18 months, but connecting them to the grid currently takes five to seven years. Of the 110 data center projects that were slated to come online in 2025, more than a quarter were delayed due to power, permitting, and construction constraints. As a result, hyperscalers are increasingly willing to accept the additional complexity of building and managing on-site power in exchange for certainty on capacity, timelines, and emissions.

The data center industry hasn’t experienced a major disruption for nearly 30 years, but the demand for tokens and the downstream requirement for energy infrastructure have entirely changed that dynamic. The U.S. federal government has also taken action, recognizing the prominent role this physical buildout plays in national security and the country’s position in the global AI race. In April 2026, President Trump invoked Section 303 of the Defense Production Act to formally designate large-scale grid infrastructure as essential to national defense, authorizing emergency federal financing to expand the domestic supply of key components in this supply chain.

The private and public demand for AI is creating one of the largest infrastructure investment cycles of our lifetimes, and with it, a massive opening for startups. Data centers dominated built-environment venture investment in 2025, accounting for 78% of capital deployed ($4.5 billion of $5.7 billion). However, we believe the hardware and software layers that make those developments possible are still in their early innings, leaving significant room to venture into the enabling technology stack. The companies that will define the next decade of energy infrastructure will be those building the software, hardware, and systems that make electrons cheaper, faster, and smarter.

Sir Henry Bessemer, our firm’s namesake, was a British engineer whose process for mass-producing steel from molten pig iron transformed a costly, artisanal material into the backbone of industrial civilization—enabling the railroads, bridges, skyscrapers, and factories that defined the modern world. More than a century later, the infrastructure we’re funding looks different, but our goal remains to back founders building the concrete systems on which the next economy will run. This roadmap guides where we think the most durable value will be created in that buildout.

Six core areas of opportunity have captivated our early-stage interest. As we explore this investment roadmap, we share our view on the current landscape of building out AI data centers and the emerging players, among them many leaders we’ve backed, such as American Terawatt, Bastille Networks, Boom Supersonic, Claroty, DriveNets, DroneDeploy, Inertia, and Verse.

Six areas of opportunity in the AI data center stack

1. Permitting and site selection

Before a data center can secure power, build infrastructure, or deploy a single GPU, it needs permission from local and state authorities to operate. This includes zoning approvals, environmental regulations, utility interconnection agreements, air and water quality permits, endangered species assessments, cultural resource surveys, and more. Each permitting process runs on its own timeline and varies by local jurisdiction. Between March 2024 and 2025, 16 data center developments were delayed or denied due to permitting restrictions, with local community pushback being a leading cause.

McKinsey estimates that over $5 billion is spent annually on complex infrastructure permitting, which represents over a third of total U.S. permitting spend. Over $1.5 trillion of infrastructure capital is currently stuck in the permitting pipeline. This demand is currently being serviced by large consulting firms, including Tetra Tech and Quanta Services, though the market is quite fragmented with more than 50,000 smaller firms, many with fewer than 15 employees. Their largely manual processes help customers tackle critical issues, analyze matrices, manage submissions, and track deficiencies.

AI-native software is now emerging across the full lifecycle, from site selection through operational compliance. Lorica is building an AI-native permit prep and execution service to automate the permitting workflow, which today is the culmination of extensive, fragmented diligence on the physical world. By capturing data on optimal permitting pathways across projects and pairing it with consultants, Lorica accelerates project development timelines by automating data collection, structuring, and submission while building a predictive intelligence layer that compounds with every engagement. Paces, another player in the space, unifies grid, permitting, and environmental data into a single platform that helps developers identify viable sites and catch flaws before committing capital. Their customers report closing roughly three times as many deals because risks like grid constraints, zoning conflicts, and environmental issues are surfaced upfront rather than discovered late in development. More broadly, better real-world visibility through spatial modelling, drone-based surveys, and resource strain simulations can help developers anticipate opposition triggers before significant capital is committed. We discuss this further in the section below on construction, maintenance, and labor opportunities.

What we look for: Product-focused teams that bring siloed data channels under one pane of glass, leverage context to build agents, and compress timelines to break ground. These are critical preconstruction workflows with high capital and time costs at stake. Hardening these brittle workflows will require handling complexity across multiple stakeholders with data flywheels that become more predictive over time.

2. Power generation

Powering data centers is shifting from the traditional grid to on-site generation. While grid-connected sites still account for the largest share by project count (45%), on-site generation and hybrid approaches together account for close to half of all announced capacity. This reflects the broader "Bring Your Own Power" (BYOP) movement, where data centers generate electricity on-site behind the meter (BTM) to sidestep waiting years in the interconnection queue for grid access. Approximately 50 GW of BTM gas generation projects were announced in 2025 alone. BYOP has become today’s dominant strategy for building new AI data centers.

Many companies are capitalizing on this, rethinking how power gets generated and delivered to the rack to enable quicker time to power while ensuring reliability (i.e., intermittent versus firm power). Companies addressing this problem are often producing modular technologies that can adjust to demands and have a scalable supply chain that is resilient to shocks.

For example, Boom Supersonic, originally known for developing supersonic passenger aircraft, has adapted its jet engine core into Superpower: a 42 MW natural gas turbine purpose-built for data center power generation. Meanwhile, Arbor is building a next-generation gas turbine that uses supercritical CO₂ as a working fluid and has built-in carbon capture to provide baseload power.

Many hyperscalers are also exploring renewables, including co-locating them with batteries BTM. These batteries serve as both massive backup power systems and a faster path to power. A BTM battery system is installed on the customer's property, charging from the grid during off-peak hours at a lower rate and discharging during peak hours, when rates are higher, which dramatically reduces demand charges while providing backup power and grid independence. Calibrant Energy has emerged as one of the dominant battery developers in the market, financing and deploying battery systems at hyperscale data center sites. Exowatt is building modular, 24/7 dispatchable solar power systems and thermal batteries that are purpose-built for AI factories.
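The charge and discharge pattern of a BTM battery can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function name dispatch_battery, the hourly time step, the greedy rule, and the example numbers. It ignores round-trip losses and real tariff structures, and it does not represent how any company named above operates.

```python
def dispatch_battery(load_kw, peak_hours, capacity_kwh, max_power_kw):
    """Greedy behind-the-meter dispatch over one-hour steps.

    Charges during off-peak hours and discharges during peak hours.
    Returns the hourly grid draw in kW. With one-hour steps, kW and kWh
    are numerically interchangeable. Losses are ignored (assumption).
    """
    stored_kwh = 0.0
    grid_kw = []
    for hour, load in enumerate(load_kw):
        if hour in peak_hours:
            # Discharge: limited by power rating, stored energy, and load.
            discharge = min(max_power_kw, stored_kwh, load)
            stored_kwh -= discharge
            grid_kw.append(load - discharge)
        else:
            # Charge: limited by power rating and remaining headroom.
            charge = min(max_power_kw, capacity_kwh - stored_kwh)
            stored_kwh += charge
            grid_kw.append(load + charge)
    return grid_kw


# Hypothetical site: flat 100 kW load, hours 2 and 3 are peak-priced.
grid = dispatch_battery([100, 100, 100, 100], {2, 3},
                        capacity_kwh=60, max_power_kw=40)
peak_draw = max(grid[h] for h in (2, 3))
```

In this toy case the grid draw during peak hours falls below the 100 kW load, at the cost of higher draw in the off-peak hours, which is the trade that lowers demand charges when they are assessed on peak-window demand.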

Looking further out, Inertia, co-founded by Twilio founder Jeff Lawson, is working to commercialize the laser-based inertial confinement fusion approach pioneered at Lawrence Livermore National Laboratory and is building toward a grid-scale power plant. Fusion remains the longest-horizon bet in the power stack, but its potential to deliver clean, dispatchable baseload at the scale AI will require makes it one of the most important categories we are tracking. The companies that demonstrate credible engineering paths to commercialization in the next few years will be foundational to the energy stack of the 2030s and beyond.

On the transmission side, American Terawatt is building a private wire high-voltage direct current transmission network that connects data centers to power sources without waiting for the public grid. We delve further into transmission (another significant bottleneck in the race to get data centers online) and servicing utilities with new-age technologies in the opportunities we see for power conversion and grid hardware.

What we look for: While data centers are a powerful market catalyst, they’re one node in a much larger electrification trend. We’re excited about modular generation technologies, a low levelized cost of energy (LCOE), and a repeatable deployment playbook across the broader electrification industry. As a result, supply chain resilience matters enormously. Power equipment has been structurally undersupplied for years, and companies that navigate manufacturing barriers, or redesign their product to sidestep them through modularity or new suppliers, carry lead time advantages that are difficult to replicate. Beyond generation itself, the most durable businesses will own the hardware relationship with the data center operator and layer dispatch optimization or predictive controls on top. And as the power stack converges toward NVIDIA’s 800V DC architecture, we believe these types of companies will also benefit from being embedded as core infrastructure partners.

3. Transmission, power conversion, and the middle mile of power

Every electron that reaches a data center first passes through a transformer. These are the electromagnetic devices that step voltage up or down at every node of the electrical grid—from generation through transmission to the facility level. Today, demand far outstrips supply. Transformer demand has increased 119% from 2019 to 2025, but manufacturing capacity hasn't come close to keeping pace. Lead times from incumbents like General Electric, Siemens, and Mitsubishi have stretched to as long as 5 years, up from ~1 year pre-COVID. The bottleneck is structural, as these are large devices that are highly engineered and built to order by a small number of manufacturers. Adjacent grid hardware faces the same crunch: switchgear lead times now stretch beyond 60 weeks, and high-voltage cable and breaker backlogs are widely reported as the next constraint after transformers.

The supply shortage is colliding with a shift in rack density that changes what the legacy power delivery architecture must deliver. Cabinet power density ran 20–40 kW in the cloud era of data centers. Some of today's AI training clusters run at 500–600 kW per rack, with NVIDIA targeting 1 MW on Rubin Ultra and beyond. That density increase, combined with the industry standardizing on 800V DC architecture, means that each additional AC/DC conversion step in the legacy chain wastes energy and adds hardware.
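Why each extra conversion step matters can be shown with simple arithmetic: stages in series multiply their efficiencies, so losses compound. The sketch below is illustrative only. The per-stage efficiencies (0.97 and 0.98), the stage counts, and the function names are assumptions, not measured figures for any vendor's equipment.

```python
from math import prod


def chain_efficiency(stage_efficiencies):
    """End-to-end efficiency of conversion stages connected in series."""
    return prod(stage_efficiencies)


def loss_kw(rack_kw, stage_efficiencies):
    """Upstream power lost in conversion to deliver rack_kw at the rack."""
    eta = chain_efficiency(stage_efficiencies)
    return rack_kw / eta - rack_kw


# Hypothetical chains feeding a 1 MW (1,000 kW) rack.
legacy = [0.97, 0.97, 0.97, 0.97]   # four assumed stages
consolidated = [0.98, 0.98]         # two assumed stages

legacy_loss = loss_kw(1000, legacy)
consolidated_loss = loss_kw(1000, consolidated)
```

Under these assumed numbers the four-stage chain delivers roughly 88.5% of input power and the two-stage chain roughly 96%, so at 1 MW per rack the difference is tens of kilowatts of heat per rack that must also be cooled.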

We see three distinct categories of opportunity here:

Tackling the supply chain constraint—finding ways to get hardware from the grid to the campus faster than incumbents can deliver
Re-architecting the transformer and the broader power conversion chain itself
Unlocking more capacity from the transmission infrastructure that already exists in the ground

The strongest companies in each of these subcategories address both speed and structure simultaneously. For high-voltage transformers, Ayr Energy compresses lead times from the 3-5 year industry standard down to 6-12 months by manufacturing substation and transmission-class transformers and circuit breakers through contract manufacturers in India. For BTM power and medium-voltage transformers, technologies such as solid-state transformers (SSTs) provide a path to reducing cost, complexity, and lead time at once. By using silicon carbide (SiC) semiconductors to convert medium-voltage AC directly to 800V DC, SSTs collapse multiple discrete stages of legacy power conversion hardware into a single modular device.

Beyond reducing the stack, SSTs are bidirectional and programmable, enabling native integration with battery energy storage systems (BESS), solar, and other on-site generation sources. Heron Power is building Heron Link, a modular medium-voltage SST purpose-built for 800V DC power conversion, and is a key partner in NVIDIA's 800V DC architecture. DG Matrix builds multi-port SSTs that accept any voltage or frequency input and are configurable for 800V DC or AC outputs, which collapses what had been 10–17 discrete power systems into a single device.

The third category sits a layer above: software and hardware that unlock more capacity from the transmission grid that already exists, without waiting for new lines to be permitted and built. Grid congestion costs U.S. electricity consumers an estimated $11.5 billion annually, and most of that capacity loss is invisible: transmission lines run conservatively below their true thermal limits because operators lack real-time visibility into actual line conditions.

GridAstra addresses this through FUSION-T, the first integrated software platform for grid-enhancing technologies. The platform models dynamic line ratings, advanced power flow controllers, and topology reconfiguration in a single environment, automatically generating control plans that re-route power off overloaded lines and onto under-utilized ones. On the hardware side, TS Conductor manufactures advanced composite-core conductors that allow utilities to roughly double the capacity of existing transmission corridors via reconductoring—avoiding the years of permitting and right-of-way acquisition that new lines require.

What we look for: Companies that solve both the supply chain constraint and the architectural one, or meaningfully expand the capacity of the grid we already have. A faster transformer that takes three years to procure is still insufficient; the winners will ship product today while scaling infrastructure to meet demand at hyperscale volumes tomorrow.

4. Software and orchestration

Generating and transmitting power are only part of the equation. The other part is determining how that power is managed, dispatched, and consumed. Data centers are becoming increasingly complex energy systems with multiple generation sources, battery storage, grid connections, and fluctuating AI workloads. Today, power management, workload scheduling, grid communication, and compliance still live in fragmented systems that are disconnected from one another, and the legacy Data Center Infrastructure Management (DCIM) platforms that were supposed to unify them (e.g., Schneider, Vertiv, Sunbird) were architected for an earlier era. We believe there’s an opportunity to rebuild this stack from the ground up around three structural shifts:

Tiered SLAs

Today's grid contracts and data center service agreements treat AI load as firm, inflexible demand. But unlike most enterprise compute, a meaningful share of AI workloads doesn’t actually need that rigidity. Training jobs can pause and resume, batch inference can shift across regions, and lower-priority workloads can absorb latency. That mismatch between contractual rigidity and underlying workload flexibility is quietly stranding enormous amounts of grid capacity.

We expect the market to move toward a tiered system that shifts batchable training to off-peak windows, accepts slightly longer latency on non-critical workloads, and returns capacity to the grid during stress events rather than holding it hostage to a contract. Emerald AI is an early breakout along this theme, building an orchestration layer that dynamically schedules GPU workloads to make AI data center power demand more flexible. When the grid tightens, Emerald can pause or throttle batchable training jobs, protect latency-sensitive inference, and shift compute across regions to wherever power is abundant. The Aurora AI Factory in Virginia, a partnership between Emerald, Digital Realty, NVIDIA, EPRI, and PJM, will be the first commercial proof point of what a power-flexible data center actually looks like in production.
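The tiering idea can be made concrete with a small sketch of priority-based curtailment under a power cap. This is a simplified illustration, not a description of Emerald AI's product: the Job type, the tier names, the shed ordering, and the plan_curtailment function are all hypothetical, and a real orchestrator would also throttle jobs and move them across regions rather than only pausing them.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    tier: str        # "training", "batch_inference", or "inference"
    power_kw: float


# Lower number = paused first. Ordering is an illustrative assumption.
SHED_ORDER = {"training": 0, "batch_inference": 1, "inference": 2}


def plan_curtailment(jobs, power_cap_kw, protected=("inference",)):
    """Choose jobs to pause so total draw fits under power_cap_kw.

    Pauses the most flexible tiers first and never pauses protected
    tiers. Returns (paused_job_names, remaining_draw_kw); the remaining
    draw can still exceed the cap if only protected jobs are left.
    """
    total = sum(job.power_kw for job in jobs)
    paused = []
    for job in sorted(jobs, key=lambda j: SHED_ORDER[j.tier]):
        if total <= power_cap_kw:
            break
        if job.tier in protected:
            continue
        paused.append(job.name)
        total -= job.power_kw
    return paused, total


# Hypothetical fleet during a grid stress event.
fleet = [
    Job("chat", "inference", 400),
    Job("pretrain", "training", 500),
    Job("embeddings", "batch_inference", 200),
]
paused, draw = plan_curtailment(fleet, power_cap_kw=700)
```

Here the training job is paused first and latency-sensitive inference keeps running, which mirrors the tiered behavior described above.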

Energy economics integrated with physical asset control

A Fortune 500 enterprise or hyperscaler running a portfolio of data centers needs to consider real estate, power prices and hedging, orchestration software, and compliance. They need to settle thousands of utility bills—hedging wholesale electricity exposure, dispatching batteries against real-time prices, accounting for renewable energy certificates, and more.

Unfortunately, most are currently doing this in spreadsheets stitched together with point solutions. Verse is building the platform that consolidates this stack: utility bill management, PPA and energy portfolio analytics, and risk and hedging in a single system. Verse’s new Dispatch Intelligence product builds on this platform, sitting on top of BTM storage and making real-time charging decisions. This integration matters because the decisions across these layers are interdependent. A long-term PPA, for example, shapes how aggressively a battery should be dispatched, just as overbilling on demand charges can affect the enterprise’s broader hedging strategy.

DCIM for AI workloads

Operators today struggle to answer these basic questions in real time: how much power is each rack actually drawing? How much thermal headroom exists before a hotspot becomes a failure? How much usable capacity is stranded by conservative nameplate ratings? Which GPUs are healthy and which are silently degrading?
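As one illustration of the stranded-capacity question, the sketch below compares a rack's provisioned (nameplate) power with its observed peak draw plus a safety margin. The function name, the margin value, and the sample readings are hypothetical, and a production system would use far richer telemetry and statistics than a simple maximum.

```python
def stranded_capacity_kw(nameplate_kw, samples_kw, safety_margin=0.10):
    """Provisioned power a rack never uses, after a safety margin.

    nameplate_kw: power budgeted for the rack.
    samples_kw: measured power draw readings for that rack.
    safety_margin: fractional headroom kept above the observed peak.
    """
    observed_peak = max(samples_kw)
    needed = observed_peak * (1 + safety_margin)
    # Never report negative stranded capacity for an over-subscribed rack.
    return max(0.0, nameplate_kw - needed)


# Hypothetical rack budgeted at 120 kW with a 25% margin over its peak.
stranded = stranded_capacity_kw(120, [60, 80, 75], safety_margin=0.25)
```

Summed across thousands of racks, a figure like this is the capacity an operator could in principle reallocate without new power.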

We believe the next-generation winner here will combine live telemetry, predictive simulation of thermal and power flows, and AI-native anomaly detection into a system that operators actually trust to make autonomous decisions, not just dashboards that consultants interpret. The competitive set ranges from incumbents under pressure to modernize to AI-native entrants like Aravolta and Phaidra.

On the other side of the meter, utilities and grid operators face a mirror-image problem. They process interconnection requests, plan generation and transmission capacity, model load growth, file rate cases, and decide which projects advance. We believe AI-native software can deliver transformative value to utilities in:

Interconnection study automation
Capital project planning and load forecasting
Regulatory filings and rate case automation

Senpilot is a team building an AI-native operating system for utilities that bundles purpose-built agents for engineering, regulatory work, and customer service into a single platform. The utility software market has historically been one of the slowest-moving in enterprise tech. We believe AI is the catalyst that could finally break that pattern, and the companies that establish themselves now can eventually become the new system of record.

Grid planning and operations

While BTM generation enables a faster path to power, the underlying grid constraint also needs to be solved. The long-term sustainability of the AI infrastructure buildout depends on the grid’s ability to absorb new load faster, and that requires utilities themselves to adopt more technology. ThinkLabs AI, which recently spun out of GE Vernova, addresses this directly by building a physics-informed AI digital twin of the electric grid that helps utilities model power flows, congestion, and interconnection impacts at a fraction of the time required by legacy tools. This enables utilities to plan upgrades and connect new loads more efficiently.

What we look for: The connective tissue across all three shifts (tiered SLAs, integration of energy economics with physical asset control, and real-time operational intelligence over the physical facility) is software spanning the IT stack and physical stack simultaneously, translating between them in real time. We believe whoever builds the software that unifies these and can accurately take action on the data will sit at the most valuable intersection in AI infrastructure.

5. Construction, maintenance, and labor

The construction and electrical engineering workforce that makes this entire buildout possible is experiencing massive labor shortages. As of late 2025, the construction industry faced a shortage of roughly 439,000 workers, coupled with an uptick in peak crew sizes that have grown from ~750 during the cloud era to 4,000 to 5,000 workers today. The average age of the data center workforce is 53, and 60% of data center providers report difficulty filling open roles. The result is a category where labor constraints are now as likely to delay a project as power or permitting. Robotics, automation, and AI-powered software are filling the gap across three distinct phases of the data center lifecycle: initial construction, ongoing operations, and end-of-life decommissioning.

Initial construction

The first physical act of every data center build is the most labor-intensive. One company in the space is Bedrock Robotics, which retrofits existing heavy machinery with autonomous technology through what they call the Bedrock Operator. Built Robotics has developed autonomous trenching and pile driving technology built for industrial sites, with a growing focus on data centers and solar farms. TerraFirma operates as a tech-enabled subcontractor, deploying robotic earthworks directly on active data center construction sites. Once construction is underway, DroneDeploy can capture aerial site data and use AI to monitor progress, flag safety risks, and verify build quality.

Ongoing operations and maintenance

At the scale of modern hyperscale campuses, unplanned downtime is costly. Gecko Robotics deploys AI-powered robots to inspect industrial infrastructure, detecting corrosion and structural issues before they cause failures. Watney Robotics is building autonomous robotic systems for inside-the-facility tasks spanning break/fix, logistics, and routine maintenance.

End-of-life: decommissioning and circularity

On the flip side of the AI buildout is the AI teardown. As hyperscalers refresh GPUs and servers on accelerated cycles to keep pace with successive hardware generations, the volume of decommissioned equipment is exploding. Molg is building robotic microfactories that autonomously disassemble decommissioned servers, laptops, and industrial electronics for component recovery, remanufacturing, and recycling.

What we look for: Companies that ease the labor crunch across the full data center lifecycle, from breaking ground to decommissioning the last server. The most defensible players will combine automation with proprietary visibility, which includes predicting failures before they cascade, surfacing build quality issues before they become rework, and recovering value from equipment that would otherwise be discarded.

6. Cooling technologies

AI workloads have driven rack densities to levels the industry has never seen before, with projections that they'll require 50x the power density of cloud-era racks within the next year. Traditional cooling methods have failed to keep up. Air cooling in particular dissipates heat too inefficiently to prevent throttling and hardware degradation. Hyperscalers and frontier labs are willing to redesign around alternative cooling technologies because overheating directly affects the useful life of hardware and performance per watt.

Liquid cooling has emerged as the dominant solution, now accounting for the majority of new AI data center cooling architectures. Within liquid cooling, two hardware approaches are most prevalent. Direct-to-chip cooling delivers coolant directly to the processor surface via cold plates and manifold distribution systems, enabling precise thermal transfer at the component level. Immersion cooling takes a more literal approach, submerging servers entirely in dielectric fluid to handle the most extreme rack densities. Of the two, direct-to-chip has more near-term applications, integrating more readily into existing data center architectures. Corintis has developed AI-designed microfluidic cold plates that route coolant directly to chip hotspots, outperforming the parallel-channel copper plates that have been the industry standard.

Beyond hardware, a second category is emerging where software orchestration and AI modeling optimize how cooling infrastructure operates. Phaidra improves thermal stability through AI agents using real-time power draw as an early-warning signal for demand spikes, enabling cooling systems to respond before chip temperatures rise and performance stalls.
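The idea of using power draw as a leading signal can be sketched as a controller that adds a feedforward term to ordinary temperature feedback. This is a generic control pattern shown under stated assumptions, not Phaidra's method: the function name, the gains kp and kff, the target temperature, and the base output are all hypothetical values chosen for illustration.

```python
def cooling_setpoint(power_kw_history, temp_c, temp_target_c=60.0,
                     kp=2.0, kff=0.5, base_pct=30.0):
    """Cooling output (percent of maximum) from feedback plus feedforward.

    Feedback reacts to temperature already above target. Feedforward
    reacts to the latest rise in power draw, which tends to precede the
    temperature rise. Requires at least two power readings.
    """
    feedback = kp * max(0.0, temp_c - temp_target_c)
    ramp_kw = power_kw_history[-1] - power_kw_history[-2]
    feedforward = kff * max(0.0, ramp_kw)
    return min(100.0, base_pct + feedback + feedforward)


# Chip is still below target, but power just ramped by 60 kW.
early = cooling_setpoint([200, 260], temp_c=58.0)
# Same temperature with flat power: no early response.
steady = cooling_setpoint([200, 200], temp_c=58.0)
```

With the power ramp, cooling output rises before any temperature error appears; with flat power it stays at the base level.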

What we look for: Cooling is becoming an active control system that determines chip performance, hardware lifetime, and increasingly, the water consumption and emissions footprint of an entire campus. We’re focused on companies that treat cooling as part of the performance stack rather than as infrastructure overhead. That means software that turns thermal data into a real-time signal for capacity and uptime, and hardware that can adapt to evolving data center architectures rather than being locked into a specific vendor.

The AI infrastructure evolution: where we go from here

The energy layer of AI infrastructure is an ecosystem of interconnected challenges. We believe each creates a distinct investment opportunity: power must be generated, intelligently dispatched, and flexibly consumed; the grid must be modernized; and the utilities that operate it must be equipped with AI-native tools. The most attractive businesses in this cycle will capture outsized growth during this capex boom and emerge on the other side as baked-in operational infrastructure with durable, long-duration economics, even if, and after, new build slows.

We’re actively seeking founders building in this space across all of these opportunities. If you’re working on any aspect of the energy infrastructure stack for AI, we would love to hear from you. To get in contact, reach out to Lindsey Li (lli [at] bvp [dot] com), Brielee Lu (blu [at] bvp [dot] com), or Josh Hechtman (jhechtman [at] bvp [dot] com).