One of the more exciting connectivity standards over the past year has been CXL. Built upon a PCIe physical foundation, CXL is a connectivity standard designed to handle much more than PCIe does. Aside from simply acting as a data transfer from host to device, CXL has three branches to support, known as IO, Cache, and Memory. As defined in the CXL 1.0 and 1.1 standards, these three form the basis of a new way to connect a host with a device. The new CXL 2.0 standard takes it a step further.
CXL 2.0 is still built upon the same PCIe 5.0 physical standard, which means that there are no updates in bandwidth or latency, but it adds some much-needed functionality that customers are used to with PCIe. At the core of CXL 2.0 are the same CXL.io, CXL.cache and CXL.memory intrinsics, dealing with how data is processed and in what context, but with added switching capabilities, added encryption, and support for persistent memory.
CXL 2.0 Switching
For users who are unfamiliar with PCIe switches, these connect to a host processor with a number of lanes, such as eight or sixteen, and then support many more lanes downstream to increase the number of supported devices. A standard PCIe switch, for example, might connect to the CPU with 16 lanes but offer 48 PCIe lanes downstream, enabling six GPUs connected at x8 apiece. There is an upstream bottleneck, but for workloads that rely on GPU-to-GPU transfer, especially on systems with limited CPU lanes, using a switch is the best way to go. CXL 2.0 now brings switching into the standard.
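The lane arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above; the function name `oversubscription` is our own and does not come from any PCIe or CXL tooling.

```python
def oversubscription(upstream_lanes: int, downstream_lanes: int) -> float:
    """Ratio of downstream lanes to upstream lanes behind a switch."""
    if upstream_lanes <= 0:
        raise ValueError("a switch needs at least one upstream lane")
    return downstream_lanes / upstream_lanes


# Figures from the example above: x16 to the CPU, six GPUs at x8 each.
upstream_lanes = 16
gpu_count = 6
lanes_per_gpu = 8

downstream_lanes = gpu_count * lanes_per_gpu
ratio = oversubscription(upstream_lanes, downstream_lanes)

print(f"{downstream_lanes} downstream lanes share {upstream_lanes} upstream lanes")
print(f"oversubscription ratio: {ratio:.1f}:1")
```

A ratio above 1:1 is the upstream bottleneck described above. It only matters when traffic has to reach the host; GPU-to-GPU transfers that stay below the switch do not cross the upstream link.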
Modern PCIe switches do more than just ‘add lanes’. Should one of the end-point devices fail (such as an NVMe SSD), the switch ensures the system can still run and disables that lane so it doesn’t affect the rest of the system. Current switches on the market also support switch-to-switch connectivity, allowing a system to scale out downstream devices.
One of the bigger updates in recent switch products has been support for multiple upstream hosts, such that if a host fails, the downstream devices still have another host to connect to. Combined with switch-to-switch connectivity, a system can have a series of pooled hosts and pooled devices. Each device can work specifically with a host in the pool in a 1:1 relationship, or the devices can work with many hosts. The new standard, with CXL Switching Fabric APIs, enables up to 16 hosts to use one downstream CXL device at once. On top of this, Quality of Service (QoS) is a part of the standard, and in order to enable this the standard packet/FLIT unit of data transfer is unaltered, with some of the unused bits from CXL 1.1 being used (this is what spare bits are used for!).
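As a rough mental model of pooling, the relationship between hosts and devices can be sketched as below. This is a toy bookkeeping structure, not the CXL Switching Fabric API: the class `PooledDevice`, the function `fail_host`, and the host names are all hypothetical, and only the 16-host limit comes from the text above.

```python
MAX_HOSTS_PER_DEVICE = 16  # limit quoted above for one downstream CXL device


class PooledDevice:
    """Hypothetical record of which hosts currently share one device."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.hosts: set[str] = set()

    def attach(self, host: str) -> None:
        # Re-attaching an already attached host is a no-op.
        if host in self.hosts:
            return
        if len(self.hosts) >= MAX_HOSTS_PER_DEVICE:
            raise RuntimeError(f"{self.name} already serves {MAX_HOSTS_PER_DEVICE} hosts")
        self.hosts.add(host)

    def detach(self, host: str) -> None:
        self.hosts.discard(host)


def fail_host(devices: list[PooledDevice], host: str) -> list[str]:
    """Drop a failed host everywhere; return names of devices left with no host."""
    for device in devices:
        device.detach(host)
    return [device.name for device in devices if not device.hosts]


# One device shared by two hosts, one device in a 1:1 relationship.
shared = PooledDevice("accelerator0")
shared.attach("hostA")
shared.attach("hostB")
dedicated = PooledDevice("accelerator1")
dedicated.attach("hostA")

orphaned = fail_host([shared, dedicated], "hostA")
print(sorted(shared.hosts), orphaned)
```

In this sketch the shared device keeps running on its remaining host after the failure, while the device in a 1:1 relationship is left without a host and would need to be reassigned from the pool.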
The one element not present in CXL 2.0 is multi-layer switch topologies. At present the standard and API only support a flat layer. In our briefing, the consortium members (some of whom already create multi-layer PCIe switch fabrics) stated that this is the first stage of enabling switch mechanics, and the roadmap will develop based on customer needs.
CXL 2.0 Persistent Memory
Another notch in enterprise computing in the past few years is persistent memory: something almost as fast as DRAM that stores data like NAND. There has always been a question of where such memory would sit: either as small, fast storage through a storage-like interface, or as slow, high-capacity memory through a DRAM interface. The initial CXL standards did not directly support persistent memory in the CXL.memory standard, unless it already had a device attached to it. This time, however, CXL 2.0 enables distinct PMEM support as part of a series of pooled resources.
The APIs enabling software to deal with PMEM support are built into the specification,…