Google Cloud introduces Trillium TPU

At Google I/O 2024 this past week, Google Cloud introduced its sixth-generation Tensor Processing Unit (TPU), Trillium. The custom-designed silicon is manufactured by TSMC and built to run AI-specific workloads.

According to Google, Trillium delivers a 4.7x increase in peak compute performance per chip compared to its predecessor, TPU v5e. This leap in performance is attributed to doubled High Bandwidth Memory (HBM) capacity and bandwidth, along with the integration of Google’s third-generation SparseCore, a specialized accelerator designed for processing recommendation and complex ranking workloads. Together, these improvements make Trillium over 67% more energy-efficient than the previous generation.
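To make that workload concrete, the sketch below shows the kind of sparse embedding lookup that recommendation and ranking models spend much of their time on, and that accelerators like SparseCore are built to speed up. It is written in JAX; the table size, batch shape, and function name are illustrative assumptions, not Trillium specifications.

```python
# Illustrative JAX sketch of a sparse embedding lookup, the memory-bound
# gather-and-reduce step at the heart of recommendation/ranking models.
# Shapes and sizes here are hypothetical examples, not Trillium specs.
import jax
import jax.numpy as jnp

def embedding_lookup(table, ids):
    # Gather rows of a large embedding table for a batch of sparse feature IDs,
    # then reduce per example to produce one dense vector per request.
    return jnp.take(table, ids, axis=0).sum(axis=1)

key = jax.random.PRNGKey(0)
table = jax.random.normal(key, (100_000, 128))       # vocabulary x embedding dim
ids = jax.random.randint(key, (32, 20), 0, 100_000)  # batch of 32, 20 IDs each

# jit compiles through XLA, which lowers the gather to whatever backend
# (CPU, GPU, or TPU) is attached at run time.
lookup = jax.jit(embedding_lookup)
print(lookup(table, ids).shape)  # (32, 128)
```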

Trillium is a crucial component of Google’s AI Hypercomputer, an infrastructure for handling complex AI workloads in the cloud that combines TPUs, GPUs, storage systems, custom networking, and open-source frameworks to deliver the performance demanding AI applications require. Importantly, Trillium maintains backward compatibility with models designed for earlier TPU generations, ensuring a smooth transition for developers. The new chip addresses two major challenges in the AI landscape: managing the escalating cost of scaling AI workloads and reducing their environmental impact through lower power consumption. Google has not announced a specific release date, but says Trillium-powered instances will be available later in 2024.
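Backward compatibility is helped by the fact that TPU model code typically targets the XLA compiler through frameworks such as JAX, TensorFlow, or PyTorch/XLA rather than a specific chip generation. The snippet below is a minimal JAX sketch of that idea; the layer and matrix sizes are arbitrary examples, not tied to any Trillium detail.

```python
# A minimal sketch of why TPU-generation upgrades are typically transparent to
# model code: JAX programs are compiled through XLA for whatever devices are
# attached, so the same function runs on older TPUs or Trillium unchanged.
import jax
import jax.numpy as jnp

@jax.jit
def forward(w, x):
    # A simple dense layer; XLA lowers it to the matrix units of the local accelerator.
    return jax.nn.relu(x @ w)

x = jnp.ones((8, 512))
w = jnp.ones((512, 256))

print(jax.devices())        # lists the attached accelerators (TPU, GPU, or CPU)
print(forward(w, x).shape)  # (8, 256), regardless of which backend compiled it
```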

The launch of Trillium is part of a growing shift away from reliance on off-the-shelf solutions towards the development of custom silicon by Google, Meta, and AWS. By designing their own chips, these companies gain the ability to optimize performance for specific workloads while minimizing power consumption. Control over the entire hardware stack also allows them to implement new features and optimizations at their own pace rather than being constrained by the release cycles of merchant silicon vendors such as Nvidia, Intel, and AMD.