Mont-Blanc project: preparing for next generation supercomputing through ARM chips

Low-power ARM chips dominate the mobile world of smartphones, tablets, and embedded IoT devices. Here, Mont-Blanc investigates how they could power supercomputers

With data centres consuming ever more power, the idea of using highly energy-efficient ARM chips in servers is enticing, especially for energy-hungry High-Performance Computing (HPC) configurations. As early as 2011, several pioneering European companies and institutions recognised the tremendous potential offered by embedded processor technology and decided to unite into the Mont-Blanc project to investigate the usage of low-power ARM processors for HPC.

However, making the leap from the mobile market to HPC was not trivial: HPC-optimised libraries, compilers and applications did not exist for ARM platforms. Mont-Blanc partners had to start from scratch building ARM HPC test systems based on 32-bit mobile phone technology, and porting and tuning software and tools to create an ARM software ecosystem. In 2015, Mont-Blanc deployed the world’s first ARM-based HPC cluster, featuring over 2,000 mobile CPUs. This system helped demonstrate the viability of using ARM technology for HPC.

Six years on, the landscape has changed dramatically. ARM has introduced its first 64-bit architecture – ARMv8. The Mont-Blanc team put a lot of effort into extending and consolidating the ecosystem developed under the first phase of the project: scientific libraries and runtime systems were ported to ARMv8, and a set of tools was developed for debugging, performance analysis, performance prediction, and automated kernel optimisation.

Interest in ARM processors is rising rapidly in the HPC community, as demonstrated, for example, by the success of the GoingARM workshop at the ISC conference in June 2017, or by the announcement of the world’s first large-scale production ARMv8 machine, ‘Isambard’, made in January 2017 at a Mont-Blanc conference by the UK’s GW4 Alliance.

European computing

Besides purely technological considerations, ARM processors are increasingly viewed as a major asset for Europe’s self-determination in HPC, not only by the European Commission but also by many leading HPC organisations in Europe.

In this favourable context, which it helped to create, the Mont-Blanc project, now in its third phase, is moving ahead. It leverages the findings of the previous project phases to devise a new high-end HPC platform that will be able to deliver a high performance/energy ratio whilst executing real applications.

More precisely, the first technical objective of the project is to create a well-balanced architecture and deliver the design for an ARM-based SoC or SoP (System on Package) capable of providing pre-exascale performance, as measured using real HPC applications. The second objective is to maximise the benefit of this new architecture for HPC applications with new high-performance ARM processors and throughput-oriented compute accelerators designed to work together.

Finally, the third objective is to develop the necessary software ecosystem for the future SoC – a fundamental asset to maximise the project impact and ensure real-life success for this ARM architecture.

For example, one of the issues we are investigating is the need to transform applications from being latency limited to being throughput limited. This was an essential finding of the previous phases of Mont-Blanc. In the same way that children throw a tantrum to obtain immediately something they desperately need, our programs issue a request for a resource and stall until whatever they require is available.

Various costly techniques are used to achieve some overlap between computation and communication. However, we believe that much more aggressive look-ahead in generating work and resource demands, together with less urgent synchronisation demands, can be achieved by resorting to an asynchronous task-based programming model such as OpenMP 4.0 or OmpSs. This transformation from latency-limited (bound by the response time of individual resource requests) to throughput-limited (bound by the total amount of resources available) is a key enabler for the future, not only in HPC but also in general-purpose computing.
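To make the distinction concrete, here is a minimal sketch of the two behaviours. It is an illustration only: it uses Python's standard `concurrent.futures` thread pool as a stand-in for a task-based runtime, not OpenMP or OmpSs themselves, and the names `fetch`, `compute`, `LATENCY_S`, and `max_in_flight` are hypothetical, chosen for this example rather than taken from the project.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumed response time of one resource request (illustrative value only).
LATENCY_S = 0.01


def fetch(block_id):
    """Hypothetical resource request: waits LATENCY_S, then returns a block."""
    time.sleep(LATENCY_S)
    return list(range(block_id, block_id + 4))


def compute(data):
    """Hypothetical computation applied to one fetched block."""
    return sum(x * x for x in data)


def run_blocking(block_ids):
    """Latency-limited: each request stalls the program until it is served."""
    return [compute(fetch(b)) for b in block_ids]


def run_tasks(block_ids, max_in_flight=8):
    """Throughput-limited: requests are issued ahead of time as tasks, so
    their latencies overlap with each other and with the computation."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Look-ahead: submit every request before any result is needed.
        futures = [pool.submit(fetch, b) for b in block_ids]
        # Consume results in submission order; only unfinished ones block.
        return [compute(f.result()) for f in futures]


if __name__ == "__main__":
    ids = list(range(16))
    for runner in (run_blocking, run_tasks):
        start = time.perf_counter()
        result = runner(ids)
        elapsed = time.perf_counter() - start
        print(f"{runner.__name__}: {len(result)} blocks in {elapsed:.3f} s")
```

Both functions return the same results. The blocking version pays the full request latency once per block, whereas the task version is bounded mainly by how many requests the pool can keep in flight, which is the shift from latency-limited to throughput-limited described above.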

One of the first outcomes of the Mont-Blanc 3 project is a new prototype based on 64-bit ThunderX2 processors from Cavium®, relying on the ARM® v8 instruction set. The system is now live at the Atos R&D centre in Les Clayes near Paris and leverages Atos’ Bull Sequana infrastructure, such as cluster management, network, power supply, and cooling. It was christened Dibona, after the Dibona peak in the French Alps, and the full configuration will ultimately include 48 compute nodes, i.e. 96 Cavium® ThunderX2 CPUs, or 3,000 cores.

Dibona is not the end-product of the Mont-Blanc project, but it is a key tool that will allow project partners to expand their research, validate Mont-Blanc performance models, and test the completeness and usability of Mont-Blanc’s solution.

The really exciting news about this prototype is that it will not remain a prototype: Atos has decided to productise it and commercialise it as a standard Bull product under the name Bull Sequana X1310. All project partners are very proud that their work led to an actual product within the timeframe of the project, and will continue to work hard to design software and hardware that directly benefits HPC end-users.

The Mont-Blanc 3 cooperative R&D project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 671697. The project partners are Bull (Atos group, coordinator), ARM, AVL, BSC, CNRS, ETH Zürich, HLRS, Universität Graz, Universidad de Cantabria, UVSQ.

Please note: this is a commercial profile

 

Etienne Walter

Mont-Blanc 3 coordinator

Atos

etienne.walter@atos.net

montblanc-project.eu

www.twitter.com/MontBlanc_Eu
