Testbeds
The EuroEXA project worked toward a series of three centralised testbeds, installed at STFC Daresbury Laboratory in the UK. Each successive testbed increases in complexity and scale, demonstrating the technologies that the EuroEXA project group has under development.
Testbed 1 was built to enable software development for the networking and storage functions that run on the Xilinx ZU9 FPGA – with four mounted on each node – together with FPGA-software development within the Xilinx environment. At this stage, we also deployed a series of Development Testbeds at partner locations to aid with their own development work.
The original plan for Testbed 2 was a scaled-out version of Testbed 1, with additional nodes housing the same set of four ZU9 chips. However, as part of the project’s co-design approach, the specification evolved, and Testbed 2 instead became part of the development journey toward Testbed 3. While this delayed its deployment, the resulting Testbed 2 offers more opportunities to demonstrate the potential for exascale computing, with each node including a VU9 accelerator and a ZU9 network/storage processor.
Finally, the project culminates in Testbed 3. This will use the project’s key architectural compute-node elements: a network/storage processor (Xilinx ZU9), a fully reprogrammable accelerator (Xilinx VU9) and a powerful 64-bit processor host. Testbed 3 will also include air-cooled and liquid-cooled variants running side by side. While the air-cooled sections will use only commodity components, this side-by-side approach will help us demonstrate and quantify the density and proximity optimisations that come with our liquid-cooled infrastructure.
Testbed 1: QFDB with FPGAs
Our first testbed includes eight interconnected Quad-FPGA Daughter Board (QFDB) nodes, each of which contains four Xilinx Zynq ZU9 chips that combine quad-core ARM A53 processors with FPGA fabric. The nodes are housed within a liquid cooling system designed by Iceotope, developed as part of the ExaNeSt project.
Total number of Nodes: 8
Processors per Node: 4x Xilinx ZU9 Quad A53 ARM+FPGA
RAM per Node: 32GB
Storage per Node: 480GB
Infrastructure: Iceotope Petagen
Status: Live, Installed Q4 2018
Testbed 2: Co-designed Scale-Out Testbed
EuroEXA is a co-designed project, an approach that ensures we develop our technologies around the needs of a range of applications. This approach has driven the evolution of Testbed 2 into a stepping stone on the journey toward Testbed 3. As such, it uses the much more powerful VU9 FPGA, which was shown to be up to four times faster than a set of four ZU9 FPGAs for key partner applications, offering both greater scalability and greater energy efficiency.
Testbed 2 Individual and Paired Development Nodes
Total number of Nodes: 12 (potentially increasing to 20)
Processors per Node: 1x Xilinx ZU9 Quad A53 ARM + FPGA; 1x Xilinx VU9 FPGA
RAM per Node: 64GB
Storage per Node: 480GB
Infrastructure: Iceotope Cold Plate
Status: Live at Iceotope; distributing to partners, Q3 2020
Roles: Software Development; Runtimes Development; Firmware Development; Interconnect Development
Testbed 2 Liquid Cooled System
Total number of Nodes: 256
Processors per Node: 1x Xilinx ZU9 Quad A53 ARM + FPGA; 1x Xilinx VU9 FPGA
RAM per Node: 64GB
Storage per Node: 480GB
Infrastructure: Iceotope Ku:l; Schneider Modular Data Centre Container
Status: Under Construction at Daresbury Labs, UK, ETA Q1 2021
Roles: Benchmarking; Real Operations
Testbed 3: High-performance host
The final testbed for the EuroEXA project is designed around bundles of nodes, each putting a high-performance host together with an accelerator and a network/storage controller. It also includes three different node configurations, offering the opportunity to showcase and contrast air-cooled and liquid-cooled technological approaches.
Testbed 3 Air Cooled ARM node
Total number of Nodes: 1
Processors per Node: 1x ARM 64 Core; 1x Xilinx FPGA Accelerator; 1x Xilinx Storage/Network Processor/FPGA
RAM per Node: 64GB
Storage per Node: 480GB
Status: Under Construction at University of Manchester, UK; ETA Q4 2020
Roles: NEEDED
Testbed 3 Air Cooled Cluster
Total number of Nodes: 32
Processors per Node: 1x AMD EPYC; 1x Xilinx FPGA Accelerator; 1x Xilinx Storage/Network Processor/FPGA
RAM per Node: 64GB
Storage per Node: 480GB
Status: Under Construction at Daresbury Labs, UK; ETA Q4 2020
Roles: Software Development; Benchmarking; Real Operations
Notes: Built from modern commercial off-the-shelf (COTS) building blocks; its 32 nodes take up more space than 2U of the liquid-cooled Testbed 3 cluster
Testbed 3 High Density, Proximity Optimised Liquid Cooled Cluster
Total number of Nodes: 32
Processors per Node: 1x AMD EPYC; 1x Xilinx VU9 FPGA Accelerator; 1x Xilinx ZU9 Storage/Network Processor/FPGA
RAM per Node: 96GB
Storage per Node: 480GB
Status: Under Construction at Daresbury Labs, UK; ETA Q2 2021
Roles: Software Development; Benchmarking; Real Operations
Notes: Physically combined with (retrofitted to) Testbed 2 to create a single compact system with hundreds of nodes in half a cabinet.
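As a rough cross-check of the "hundreds of nodes" figure, the following Python sketch tallies the node counts, RAM and storage listed above. It is illustrative only: the names Testbed, TESTBEDS and totals are hypothetical, the figures are the planned values from the lists on this page rather than measured deployments, and it assumes that the Testbed 2 system being retrofitted is the 256-node liquid-cooled one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Testbed:
    """One configuration from the spec lists (hypothetical helper type)."""
    name: str
    nodes: int
    ram_gb_per_node: int
    storage_gb_per_node: int


# Planned figures copied from the lists above; the Testbed 2 development
# nodes are listed as 12, potentially increasing to 20.
TESTBEDS = [
    Testbed("Testbed 1", 8, 32, 480),
    Testbed("Testbed 2 development nodes", 12, 64, 480),
    Testbed("Testbed 2 liquid cooled", 256, 64, 480),
    Testbed("Testbed 3 air-cooled ARM node", 1, 64, 480),
    Testbed("Testbed 3 air-cooled cluster", 32, 64, 480),
    Testbed("Testbed 3 liquid-cooled cluster", 32, 96, 480),
]


def totals(testbeds):
    """Sum node count, RAM and storage over the given configurations."""
    return {
        "nodes": sum(t.nodes for t in testbeds),
        "ram_gb": sum(t.nodes * t.ram_gb_per_node for t in testbeds),
        "storage_gb": sum(t.nodes * t.storage_gb_per_node for t in testbeds),
    }


# Assumption: the combined system is Testbed 2 (liquid cooled) plus the
# Testbed 3 liquid-cooled cluster.
combined = totals(
    [t for t in TESTBEDS
     if t.name in ("Testbed 2 liquid cooled", "Testbed 3 liquid-cooled cluster")]
)

if __name__ == "__main__":
    print(combined)
```

Under these assumptions the combined liquid-cooled system comes to 288 nodes (256 + 32), consistent with the "hundreds of nodes" described in the note.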