JUNE 18–22, 2017

Presentation Details

Name: From High-Performance Computing to Data-Driven Routes Towards New Insight into Materials Properties
Time: Tuesday, June 20, 2017, 11:30 am - 12:00 pm
Room: Panorama 2, Messe Frankfurt
Speaker: Claudia Draxl, Humboldt University Berlin & Fritz Haber Institute of the Max Planck Society
Ab initio computational high-throughput studies are producing data at an exponentially growing rate. Typically, only a small fraction of these data is eventually published, while most of the results on the quantum-mechanical many-body problem are thrown away. Keeping the data may be considered an unnecessary Big-Data problem. However, it could also be considered a chance: the chance to learn about physical properties and processes.
How can we exploit the wealth of information inherent in the materials data and extract unprecedented insight? On the one hand, new tools need to be developed for exploring similarities among materials and their properties and for finding trends and anomalies. These tools comprise data-analytics approaches such as machine learning, compressed sensing, and the like. On the other hand, further factors need to be addressed for this new branch of materials research to be successful: the performance, portability, and accuracy [1,2] of computer codes, error bars, and the comparability of data.
The NOMAD Laboratory, a European Center of Excellence [3], addresses all these questions. It is a community-driven activity with the mission to serve the whole field of materials science and engineering. It tackles the issues of Big Data in materials science, starting from the NOMAD Repository [4], which promotes the idea of open access and sharing of materials data and by now contains results from more than 18 million DFT total-energy calculations. This corresponds to several billion CPU-core hours spent on high-performance computers worldwide. With these and a growing stream of incoming data, we are building a Materials Encyclopedia to provide user-friendly access to all these results, also making use of Advanced Graphics. Novel Big-Data Analytics tools [5,6] are being developed for finding trends, identifying outliers, and predicting new materials with tailored properties.
This work received funding from the European Union’s Horizon 2020 research and innovation programme, grant agreement No 676580.
[1] K. Lejaeghere et al., Reproducibility in density-functional theory calculations of solids, Science 351, aad3000 (2016). 
[2] A. Gulans, A. Kozhevnikov, and C. Draxl, One-microHartree precision in density-functional-theory calculations, preprint.
[3] The NOMAD Laboratory, Center of Excellence for Computing Applications, funded within HORIZON 2020: https://nomad-coe.eu
[4] The NOMAD Repository: https://repository.nomad-coe.eu
[5] L. M. Ghiringhelli, J. Vybiral, S. V. Levchenko, C. Draxl, and M. Scheffler, Big Data of Materials Science - Critical Role of the Descriptor, Phys. Rev. Lett. 114, 105503 (2015).
[6] L. M. Ghiringhelli, J. Vybiral, E. Ahmetcik, R. Ouyang, S. V. Levchenko, C. Draxl, and M. Scheffler, Learning physical descriptors for materials science by compressed sensing, New J. Phys., in print (2017).