CEPP

Center for Excellence in Parallel Programming

Unleash the performance of your HPC applications

For many years, steady improvements in processors delivered performance gains with no extra effort. Today, the growing number of compute cores, in CPUs as well as in accelerators and co-processors, means that real optimization work is needed to get maximum performance.

Atos’s Center for Excellence in Parallel Programming (CEPP), operated in partnership with Intel and NVIDIA, helps you get optimal performance and maximum energy efficiency from your applications on manycore technologies. The CEPP’s experts can advise you and help you analyze, optimize and port your codes. This includes, for example:

  • Proofs of Concept (POCs) to demonstrate performance gains,
  • workshops that give you the opportunity to exchange ideas with experts and get started with the porting, optimization and acceleration of your simulations,
  • application and solution benchmarks,
  • tailored training,
  • access to specific compute resources,
  • the Fast Start program, which ensures your applications make the most of your Bull supercomputers from day one (porting, optimizing and configuring your applications well ahead of system delivery).

CEPP in action: code optimisation for SKA-France

The SKA (Square Kilometre Array) will be the largest radio telescope ever built and will produce science that changes our understanding of the universe. It will be sited in both Australia and Africa, and the project involves 100 organisations across about 20 countries. SKA-France is the national coordination of the industrial, technical and scientific activities in France that prepare for the SKA project. Its work to optimise new radio astronomy algorithms through collaboration between researchers and HPC companies is producing interesting results.

For the first experiments, Atos’s CEPP worked on the calibration and imaging code “DDFacet” by Cyril Tasse (OBSPM), built with two different compiler suites: GNU and Intel. The aim was to dive into the DDFacet software stack in order to understand the different processing phases and measure each phase’s share of the total execution time. These experiments have highlighted potential improvements, which will be investigated through closer collaboration with the developers.
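Measuring how much of the total run time each processing phase takes can be sketched in a few lines of Python. This is only an illustration of the general idea, not the instrumentation used on DDFacet: the `PhaseTimer` class, the phase names and the `busy_work` stand-in are all hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PhaseTimer:
    """Accumulate wall-clock time per named phase (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)

    @contextmanager
    def phase(self, name):
        # Time the enclosed block and add it to the running total for `name`.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start

    def shares(self):
        """Return each phase's fraction of the total measured time."""
        total = sum(self.totals.values())
        if total == 0.0:
            return {}
        return {name: t / total for name, t in self.totals.items()}


def busy_work(n):
    # Stand-in for a real processing step; it only burns CPU time.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    timer = PhaseTimer()
    # Phase names below are placeholders, not DDFacet's actual phases.
    with timer.phase("load"):
        busy_work(50_000)
    with timer.phase("calibrate"):
        busy_work(200_000)
    with timer.phase("image"):
        busy_work(400_000)
    # Print phases from most to least expensive.
    for name, share in sorted(timer.shares().items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} {share:6.1%}")
```

Sorting the phases by share shows where optimization effort is most likely to pay off. A real study would normally rely on a dedicated profiler rather than hand-placed timers.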

SKA-France Monthly Bulletin (July 2017)

News on SKA-France website (in French)

CEPP in action: porting SEISCOPE to Intel Knights Landing

SEISCOPE is a consortium managed by three French public laboratories (LJK, Geoazur and ISTERRE) and sponsored by nine Oil & Gas companies. The members work together on quantitative seismic imaging, with the aim of applying the results of their joint R&D to improve their operations.

As part of SEISCOPE, an efficient Full Waveform Inversion code called GeoInv3D is being developed; it combines 3D finite-difference time-domain modelling with frequency-domain inversion. GeoInv3D is memory-bound and is expected to benefit from the high-bandwidth MCDRAM memory built into the latest generation of Intel® Xeon Phi™ processors, Knights Landing (KNL).
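To illustrate why a memory-bound code stands to gain from higher memory bandwidth, here is a minimal sketch based on the roofline model, a general performance model that is not taken from the GeoInv3D study. Every figure in it is a made-up placeholder: the flop and byte counts per grid point, the peak compute rate and both bandwidth values are assumptions, not measurements of GeoInv3D or specifications of any processor.

```python
def arithmetic_intensity(flops_per_point, bytes_per_point):
    """Floating-point operations performed per byte moved to or from memory."""
    return flops_per_point / bytes_per_point


def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: limited either by compute peak or by memory traffic."""
    return min(peak_gflops, intensity * bandwidth_gbs)


def is_memory_bound(intensity, peak_gflops, bandwidth_gbs):
    """True when memory bandwidth, not compute, sets the performance ceiling."""
    return intensity * bandwidth_gbs < peak_gflops


if __name__ == "__main__":
    # Hypothetical stencil cost per grid point (illustrative numbers only).
    ai = arithmetic_intensity(flops_per_point=49, bytes_per_point=12)

    peak = 2000.0    # GFLOP/s, placeholder compute peak
    slow_bw = 100.0  # GB/s, placeholder for conventional memory
    fast_bw = 400.0  # GB/s, placeholder for high-bandwidth memory

    for label, bw in (("conventional", slow_bw), ("high-bandwidth", fast_bw)):
        bound = attainable_gflops(ai, peak, bw)
        print(f"{label:15s} ceiling {bound:8.1f} GFLOP/s, "
              f"memory-bound: {is_memory_bound(ai, peak, bw)}")
```

As long as the kernel stays below the compute peak, the model's ceiling scales linearly with bandwidth, which is why a bandwidth-limited stencil code is a natural candidate for this kind of memory.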

Atos’s Center for Excellence in Parallel Programming presents the results obtained on an Intel® Xeon Phi™ 7210 processor.

White paper

SEISCOPE Consortium website

CEPP in action: GPU accelerated implementation of NCI calculations using promolecular density

This scientific paper was produced by Atos’s CEPP and the University of Reims Champagne-Ardenne, a long-time partner and customer of Atos in HPC.

The NCI (noncovalent interaction) approach is a modern tool for revealing noncovalent chemical interactions. It is particularly attractive for describing ligand–protein binding. The paper presents a custom implementation of NCI using promolecular density, designed to leverage the computational power of NVIDIA graphics processing unit (GPU) accelerators through the CUDA programming model. The performance of three versions of the code is examined on a test set of 144 systems. NCI calculations are particularly well suited to the GPU architecture, which drastically reduces the computation time. On a single compute node, the dual-GPU version delivers a 39-fold improvement for the biggest instance compared with the optimal OpenMP parallel run (C code, icc compiler) on 16 CPU cores. Energy consumption measurements carried out on both the CPU and GPU NCI tests show that the GPU approach provides substantial energy savings.
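The quantity at the heart of NCI analysis is the reduced density gradient, s = |∇ρ| / (2 (3π²)^(1/3) ρ^(4/3)), which takes low values in low-density regions where noncovalent interactions occur. The sketch below evaluates it on the CPU in plain Python for a promolecular density built as a sum of atomic contributions. It is not the paper's CUDA code: each atom is modelled here with a single exponential, and the coefficients `c` and `zeta` are hypothetical values chosen for illustration, not fitted atomic parameters.

```python
import math

# Constant in the denominator of the reduced density gradient.
CS = 2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0)


def density_and_gradient(point, atoms):
    """Promolecular density and its gradient at `point`.

    `atoms` is a list of (x, y, z, c, zeta) tuples; each atom contributes
    c * exp(-d / zeta), a simplified single-exponential model (assumption).
    """
    px, py, pz = point
    rho = 0.0
    gx = gy = gz = 0.0
    for ax, ay, az, c, zeta in atoms:
        dx, dy, dz = px - ax, py - ay, pz - az
        d = math.sqrt(dx * dx + dy * dy + dz * dz)
        term = c * math.exp(-d / zeta)
        rho += term
        if d > 0.0:
            # d/dr of c*exp(-d/zeta), projected on the unit vector (dx, dy, dz)/d.
            f = -term / (zeta * d)
            gx += f * dx
            gy += f * dy
            gz += f * dz
    return rho, (gx, gy, gz)


def reduced_density_gradient(point, atoms):
    """s = |grad rho| / (CS * rho^(4/3)); returns inf where rho underflows to 0."""
    rho, (gx, gy, gz) = density_and_gradient(point, atoms)
    if rho == 0.0:
        return math.inf
    grad_norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    return grad_norm / (CS * rho ** (4.0 / 3.0))


if __name__ == "__main__":
    # Two identical hypothetical atoms placed on the x axis.
    atoms = [(-1.5, 0.0, 0.0, 1.0, 0.5), (1.5, 0.0, 0.0, 1.0, 0.5)]
    # Each grid point is evaluated independently of all the others.
    for i in range(-4, 5):
        x = 0.25 * i
        s = reduced_density_gradient((x, 0.0, 0.0), atoms)
        print(f"x = {x:5.2f}  s = {s:.4f}")
```

Because every grid point is computed independently from the same list of atoms, one natural way to map the calculation onto a GPU is to assign one thread per grid point.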

Paper in Journal of Computational Chemistry

Our partners

Interested in hearing more about our HPC services?