CPUs are virtualized
GPUs are next

  1. 1964

    CP-40 virtualizes mainframes

    Early computers, large systems known as mainframes, cost millions of dollars. These systems were far too expensive for a single user, and so were shared across employees within an enterprise. Early time-sharing systems let many users share one operating system, but the approach quickly became impractical. IBM, one of the largest manufacturers of mainframes, therefore wanted to develop a way for each user to have their own operating system, running simultaneously on the mainframe. Robert Creasy led CP-40, a project designed to build such a system. CP, or "Control Program", divided one mainframe into many virtual computers. Each user appeared to have a computer of their own, while in the background the control program decided how to share the hardware efficiently across all of the independent workloads. In turn, this changed the cost equation: more users on one mainframe made it much easier to justify the price, accelerating adoption.

  2. 1998

    VMware virtualizes x86

    By this point, virtualization was standard for mainframes, but x86 systems were still allocated as dedicated machines. Virtualizing x86 remained an open research question, and one of the focuses of Mendel Rosenblum's Operating Systems group at Stanford. When a number of research prototypes showed promise, he and several researchers spun out VMware to bring them to market. At the time, x86 processors and existing operating systems did not natively support virtualization, so the team used creative techniques that did not require applications to be rewritten. As the software improved, x86 servers could be used more efficiently and flexibly, further expanding their adoption.

  3. 2006

    AWS uses virtualization at scale

    Despite this progress, using a server still meant purchasing and deploying physical hardware. Virtualization improved the efficiency of this hardware at the server level, but each enterprise still bought and managed its own hardware. Internal teams at Amazon had spent substantial time, capital, and engineering effort learning how to optimize their servers. They used these lessons to create a service that managed infrastructure for their customers: AWS. By pooling this infrastructure, AWS could achieve massive economies of scale. For example, each enterprise had its own inefficiencies and usage patterns, but pooled across many enterprises these averaged out into predictable demand. Within AWS's compute offering, EC2, CPU virtualization was a key enabler. Instead of requesting physical servers, customers could request instances with specified amounts of compute. Virtualization then allowed Amazon to allocate hardware flexibly across workloads. Users no longer had to think about hardware and instead benefited from lower pricing and improved reliability. In doing so, AWS proved that virtualization could unlock these economies of scale across fleets of shared servers.

  4. 2012

    AlexNet starts the transition to GPUs

    Everything discussed so far has been CPU-based. Meanwhile, a new type of chip was starting to gain traction. Until this point, GPUs were primarily used to accelerate graphics, with only niche applications in scientific computing. That changed with the AlexNet paper. AlexNet was a frontier AI model trained entirely on GPUs in a fraction of the time required by CPUs. Jensen Huang, CEO of Nvidia, the largest manufacturer of GPUs, recognized the implications of this discovery and pivoted the company hard toward AI. GPUs were no longer specialized graphics accelerators; they had become the critical compute resource for the next generation of AI. In turn, this began an arms race to deploy GPUs at enormous scale and to train ever-larger AI models, prioritizing capacity over all else. Yet despite this shift, GPUs are still allocated as physical machines, leaving room for the virtualized abstractions that enabled abundance for CPUs.

  5. 2022–present

    Thunder Compute virtualizes GPUs

    Thunder Compute is building that virtualization layer for GPUs. As with early x86 virtualization, the effort began with experimental systems research in 2022 and has since evolved into a company dedicated to bringing GPU virtualization to market. GPUs do not have hardware support for a flexible, general-purpose abstraction layer, so Thunder Compute uses existing abstractions in creative ways to provide one. The benefits are familiar: GPUs become schedulable resources independent of the physical chips beneath them. Over time, we believe most GPUs will be virtualized, just as most CPUs are today. The case is clear, and the need is unprecedented. Thunder Compute exists to bring this future to life.