Right now, chiplets are mostly found in high-margin processors, but that’s likely to change. The recently introduced ‘universal’ chiplet interconnect standard UCIE could fast-track the technology.
It seems that monolithic IC design of processors has had its day in the sun. AMD’s next generation of exascale-class GPU accelerators, the MI200 family, combines two full-fledged GPU dies. The flagship of the series, the MI250X, will be used alongside Epyc CPUs to power the world’s first exascale supercomputer at the Oak Ridge National Laboratory in the US.
Not to be outdone, Intel is dotting the i’s on the Ponte Vecchio processor, which combines compute, cache, networking and memory chiplets – tiles in Intel’s nomenclature – for a whopping total of 47 active and 16 thermal tiles inside a single package. Ponte Vecchio will co-power the 2+ exaflop Aurora supercomputer, to be deployed at the end of the year at the US Argonne National Laboratory.
In fact, you’d be hard-pressed to find a vendor in the high-performance compute arena that hasn’t launched or announced some sort of multichip design. Nvidia is entering the server CPU market with the Grace CPU Superchip, connecting two distinct CPUs. Apple ‘fused’ two in-house designed M1 Max processors to create the M1 Ultra, which powers the Apple Mac Studio.
Supercomputers, servers and workstations are high-margin products. But AMD, Intel and others have been working on less fancy chiplet-based chips for consumer markets, too. AMD introduced its first multi-die processors for PCs as early as 2017, though admittedly these were intended for enthusiasts willing to pay extra for a relatively modest real-life performance improvement. A highly unusual (and predictably brief) partnership between AMD and Intel delivered the Kaby Lake G line of mobile gaming processors in 2018, combining an Intel CPU with an AMD GPU.
As advanced packaging technologies mature and costs come down, chiplets are expected to become increasingly mainstream. The Universal Chiplet Interconnect Express (UCIE) standard, announced last month, may speed up that process.
Bang for the buck
After decades of efforts to increase integration, multiple factors are now driving chipmakers’ move to multi-die designs. An obvious one is transistor count: by having two or more chips work intimately together, companies can take transistor count beyond what would be possible for a single chip. Such a monolith would become so large that yield issues are bound to crop up, though that’s not stopping at least one company from taking monolithic design to the absolute extreme.
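The yield argument can be made concrete with the textbook Poisson yield model, in which the fraction of defect-free dies falls off exponentially with die area. The sketch below is only an illustration: the defect density and die areas are made-up assumptions rather than foundry figures, and the function name `poisson_yield` is hypothetical.

```python
import math


def poisson_yield(area_cm2: float, defects_per_cm2: float) -> float:
    """Expected fraction of defect-free dies under the Poisson yield model."""
    return math.exp(-area_cm2 * defects_per_cm2)


# Illustrative assumptions only, not real process data.
DEFECT_DENSITY = 0.1   # defects per cm^2
MONOLITHIC_AREA = 8.0  # one large die, in cm^2
CHIPLET_AREA = 4.0     # each of two chiplets, in cm^2

monolithic = poisson_yield(MONOLITHIC_AREA, DEFECT_DENSITY)
chiplet = poisson_yield(CHIPLET_AREA, DEFECT_DENSITY)

# Silicon that has to be fabricated per working product, assuming chiplets
# are tested individually before packaging and packaging itself is lossless.
mono_silicon = MONOLITHIC_AREA / monolithic
chiplet_silicon = 2 * CHIPLET_AREA / chiplet

print(f"monolithic yield: {monolithic:.1%}, silicon per good product: {mono_silicon:.1f} cm^2")
print(f"chiplet yield:    {chiplet:.1%}, silicon per good product: {chiplet_silicon:.1f} cm^2")
```

Under these assumed numbers, the smaller chiplet yields noticeably better than the large die. And if chiplets are tested before they’re packaged, a defect costs one small die instead of the whole large one.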
The multi-chip approach does incur a performance penalty, since inter-chip communication can’t match intra-chip communication. A chip made of two chiplets, for example, doesn’t deliver twice the performance of a single such die. Nevertheless, as the high-performance compute examples above illustrate, chiplet technology is capable of pushing the limit of computational power. At the same time, bringing more system components together in one package shortens communication lines, which reduces latency and cuts overall power consumption.
This takes us to another driving force for chiplet adoption: the ability to mix and match dies. In a laptop processor, for example, the CPU cores and GPU benefit from leading-edge silicon, but on-board memory, Wi-Fi and I/O interfaces much less so – or not at all. It therefore often makes sense to realize these functions in different process nodes. As the cost of bleeding-edge silicon skyrockets, this approach can, depending on the design, significantly reduce overall manufacturing costs.
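A back-of-the-envelope sketch shows how the trade-off plays out. Everything in it is hypothetical: the block areas, the relative cost per square millimeter of each node, the packaging overhead and the function names are assumptions chosen for illustration, not figures from any vendor.

```python
# Hypothetical relative cost per mm^2 of good silicon (arbitrary units).
COST_PER_MM2 = {"leading_edge": 3.0, "mature": 1.0}

# Hypothetical laptop-processor blocks:
# name -> (area in mm^2, does it benefit from leading-edge silicon?)
BLOCKS = {
    "cpu_cores": (40.0, True),
    "gpu": (60.0, True),
    "io_and_wifi": (50.0, False),
}


def monolithic_cost(blocks, cost_per_mm2):
    """Every block sits on one die, so all area is paid at the leading-edge rate."""
    total_area = sum(area for area, _ in blocks.values())
    return total_area * cost_per_mm2["leading_edge"]


def mixed_node_cost(blocks, cost_per_mm2, packaging_overhead):
    """Each block is built in the cheapest node that suits it, plus a packaging cost."""
    silicon = sum(
        area * cost_per_mm2["leading_edge" if needs_leading_edge else "mature"]
        for area, needs_leading_edge in blocks.values()
    )
    return silicon + packaging_overhead


mono = monolithic_cost(BLOCKS, COST_PER_MM2)
mixed = mixed_node_cost(BLOCKS, COST_PER_MM2, packaging_overhead=30.0)
print(f"monolithic: {mono:.0f}, mixed nodes: {mixed:.0f}")
```

With these made-up inputs, the mixed-node version comes out cheaper, but the margin depends entirely on the design: a larger packaging overhead or a smaller share of node-insensitive logic can erase it.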
Mixing and matching can also mean combining dies from multiple chipmakers. Assembling a design from ‘off-the-shelf’ solutions is in many cases more cost-effective than having an army of engineers spend years taping out a monolithic design. It also gives every chipmaker access to the highest-performance, best-in-class or best bang-for-the-buck silicon. Intel, for example, is having the compute tiles of its Ponte Vecchio processor manufactured at TSMC, at least for the time being.
As the realization has sunk in that monolithic design isn’t always the best way to go anymore, chipmakers have gradually stepped up their chiplet activities over the past few years. Now that most have gotten their hands dirty with one or more product releases, a tipping point appears near. One issue preventing chiplets from attaining their maximum benefits is the fact that companies rely on different technologies to interconnect dies.
This is where UCIE comes in. UCIE brings the well-established Peripheral Component Interconnect Express (PCIE) and Compute Express Link (CXL, which uses PCIE as the physical layer) standards inside the chip package. Just as PCIE helped electronics manufacturers hook up their graphics cards, solid-state drives and other components to PC motherboards, UCIE will provide chipmakers with an open and interoperable standard to combine silicon chiplets into a single package.
Two caveats could stand in the way of UCIE establishing the universal ecosystem for chiplets that the founding entities have in mind, though. While the standard is backed by a number of heavy-hitting companies, including AMD, Arm, Google Cloud, Intel, Meta, Microsoft, Qualcomm, Samsung and TSMC, not everybody is on board. Notable absentees include Amazon, Apple, Broadcom, IBM and the memory vendors. Nvidia has expressed tentative interest in joining.
More importantly, it remains to be seen whether a single interconnect standard will ever be able to satisfy every need. Many considerations go into selecting the right method to interconnect dies. In some cases, speed is everything; in others, it’s latency or power. Time will tell how far UCIE can take the industry – or is it: how far the industry can take UCIE?