In brief: GPUs run into memory limitations when facing the demands of AI and HPC applications. There are ways around this bottleneck, but the solutions can be expensive and cumbersome. Now, a startup headquartered in Daejeon, South Korea, has developed a new approach: using PCIe-attached memory to expand capacity. Developing this solution required jumping through many technical hoops, and there are still challenges ahead. Namely, will AMD, Intel, and Nvidia support the technology?
Memory requirements stemming from advanced datasets for AI and HPC applications often swamp the memory built into a GPU. Expanding that memory has typically meant installing expensive high-bandwidth memory, which often requires changes to the existing GPU architecture or software.
One solution to this bottleneck is being offered by Panmnesia, a company backed by South Korea’s KAIST research institute, which has introduced new technology that allows GPUs to access system memory directly through a Compute Express Link (CXL) interface. Essentially, it lets GPUs use system memory as an extension of their own memory.
Called CXL GPU Image, this PCIe-attached memory has double-digit nanosecond latency, which is significantly faster than traditional SSDs, the company says.
Panmnesia had to overcome several technical challenges to develop this system.
CXL is a protocol that works on top of a PCIe link, but the technology has to be recognized by an ASIC and its subsystem. In other words, one cannot simply add a CXL controller to the tech stack, because GPUs have no CXL logic fabric or subsystems that support DRAM and/or SSD endpoints.
Also, GPU cache and memory subsystems do not recognize any expansions except unified virtual memory (UVM), which is not fast enough for AI or HPC. In tests by Panmnesia, UVM delivered the worst performance across all the GPU kernels tested. CXL, however, provided direct access to the expanded storage via load/store instructions, eliminating the issues that hamper UVM, such as overhead from host runtime intervention during page faults and transferring data at page granularity.
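The cost of page-level transfers can be illustrated with a rough back-of-the-envelope sketch. The page size, the transfer granularity for load/store access, and the access pattern below are all illustrative assumptions, not figures from Panmnesia, and the model counts only bytes moved; it ignores latency and the host runtime overhead on page faults.

```python
# Toy model: bytes moved for a sparse access pattern when data is
# transferred per page versus per small load/store-sized block.
# All constants are assumptions chosen for illustration only.

PAGE_SIZE = 4096  # assumed page size in bytes
LINE_SIZE = 64    # assumed load/store transfer granularity in bytes


def bytes_moved(addresses, granularity):
    """Bytes transferred if every touched block of `granularity` bytes moves once."""
    touched_blocks = {addr // granularity for addr in addresses}
    return len(touched_blocks) * granularity


# Hypothetical sparse pattern: 256 reads, one every 1 MiB.
sparse_addresses = [i * (1 << 20) for i in range(256)]

page_bytes = bytes_moved(sparse_addresses, PAGE_SIZE)
line_bytes = bytes_moved(sparse_addresses, LINE_SIZE)

print(f"page-granular transfers: {page_bytes} bytes")
print(f"line-granular transfers: {line_bytes} bytes")
print(f"ratio: {page_bytes // line_bytes}x")
```

Under these assumptions, each isolated read drags in a whole page in the page-granular case, so the amount of data moved grows with the page size rather than with the data actually used.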
What Panmnesia developed in response is a series of hardware layers that support all of the key CXL protocols, consolidating them into a unified controller.
The CXL 3.1-compliant root complex has multiple root ports that support external memory over PCIe, and a host bridge with a host-managed device memory (HDM) decoder that connects to the GPU’s system bus and manages the system memory.
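As a very simplified sketch of what such a decoder does conceptually, the example below routes an address to a root port by matching it against fixed address ranges. The class name `HdmRange`, the function `decode`, and the address layout are hypothetical and chosen for illustration; this is not Panmnesia's design, and it leaves out features of real HDM decoders such as interleaving across devices.

```python
from dataclasses import dataclass

GIB = 1 << 30


@dataclass(frozen=True)
class HdmRange:
    """Hypothetical address window backed by memory behind one root port."""
    base: int       # first address covered by the window
    size: int       # window length in bytes
    root_port: int  # root port that the window maps to

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def decode(ranges, addr):
    """Return (root_port, offset within window), or None if no window matches."""
    for window in ranges:
        if window.contains(addr):
            return window.root_port, addr - window.base
    return None


# Assumed layout: two 64 GiB windows placed above 16 GiB of local GPU memory.
windows = [
    HdmRange(base=16 * GIB, size=64 * GIB, root_port=0),
    HdmRange(base=80 * GIB, size=64 * GIB, root_port=1),
]

print(decode(windows, 16 * GIB + 4096))  # falls in the first window
print(decode(windows, 80 * GIB))         # first byte of the second window
print(decode(windows, 1 * GIB))          # below both windows, so no match
```

In this toy layout, addresses below the first window would be served by the GPU's own memory, and anything inside a window is forwarded to the matching root port with a window-relative offset.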
Panmnesia also faces challenges that are beyond its control, a big one being that AMD and Nvidia must add CXL support to their GPUs. It’s also possible that industry players will decide they like the approach of using PCIe-attached memory for GPUs and go on to develop their own technology.