## Goals

The goal of the FEFF project is to perform quantitative calculations of electron-photon interactions in arbitrary materials, including dynamic response to time-dependent fields. The output includes optical constants, e.g., absorption, energy loss, etc., from the infrared to x-ray energies. The significance is multifaceted. Spectroscopy serves two complementary functions in the design and study of materials: 1) x-ray spectra provide structural information about a material and 2) its optical properties are often related to its desired function. Understanding local structure during chemical reactions is important in a wide range of practical applications, from batteries to catalysts. The interaction of light with a material is important for systems like solar cells or photo-active molecules. Our research combines a variety of first-principles computational methods, and provides both interpretation of experimental measurements and a guide to material design. Optical constants, which heretofore have had to be measured or obtained from standard tables, are used in applications ranging from physics and materials science to geophysics, catalysis and biophysics. The software developed by our group is widely used in the analysis of synchrotron x-ray data.

One main use is the interpretation of spectra, e.g., from synchrotron x-ray and electron energy loss measurements. Going beyond conventional density functional theory (DFT), our approach is based on excited state electronic structure and uses real space Green's function (RSGF) techniques, the Bethe-Salpeter equation (BSE), and real-time time-dependent DFT (RT-TDDFT). Our research developments follow three main thrusts (technical notes A-C below). First principles simulations of photon and electron spectroscopies pose a significant challenge since they span a variety of phenomena over a wide range of wavelengths, and spatial and time scales. Given that no single method is adequate to encompass such varied regimes, integrated methods are needed to treat important many-body and dynamic structure effects. Our strategy for calculating these properties combines the advanced spectroscopy codes developed by our group, and modern electronic structure and molecular dynamics codes. Many of these are large scale calculations that require high-performance computing (HPC) facilities. The resulting hybrid codes yield accurate calculations of optical response from infrared to x-ray energies, for a wide variety of complex systems of current interest in materials, chemical, geological and biological sciences. Such systems range from water and aqueous solutions to complex materials like catalysts, nano-structures and ceramics with unusual structural properties.

## Thrust A: FEFFMPI, DFTMD2FEFF and auxiliary codes

This thrust covers RSGF calculations of optical, UV and x-ray spectra using MPI versions of the photon-spectroscopy FEFF codes developed by the Rehr group. The driving force behind this thrust is FEFFMPI, a mature code that is under continuous development. The DMDW/RTDW codes provide ab initio Debye-Waller factors through an interface to electronic structure codes such as GAUSSIAN, ABINIT, VASP, QUANTUM ESPRESSO, SIESTA and ORCA. This avoids the need for adjustable parameters or phenomenological models. In addition, standalone versions of DMDW/RTDW provide a variety of phonon properties of interest. Similarly, AI2PS provides electron-phonon coupling properties such as the phonon self energy. By combining FEFFMPI with real-time DFT (e.g., with VASP or SIESTA) or model potential (3DMx) MD calculations, we can simulate systems with dynamical disorder. This approach is key to understanding the behavior of complex materials including nano-structures and aqueous solutions. These improvements are among the current efforts on x-ray spectroscopy theory. Extensions to other spectroscopies such as resonant inelastic x-ray scattering (RIXS) have been added recently.

Our current code base is centered around FEFFMPI v9 and JFEFF, its associated GUI. We have recently stabilized the self-consistent field loop and automated the calculation of atomic based dielectric functions. In addition, we have incorporated new physics and added new analysis tools. Of particular interest are a finite temperature generalization, which extends FEFFMPI to high temperatures, and the ability to print electron densities. As an example, we have investigated solid state effects on the density dependence of Compton spectra [arXiv:1308.2990, Submitted to Phys. Rev. Lett.].

Using DFTMD2FEFF, we have studied the structural and spectroscopic properties of nanomaterials, in particular supported nanoparticles. We focused mainly on dynamical effects on supported PtSn nanoclusters [J. Phys. Chem. C 117, 12446 (2013)]. The DFT/MD approach treats the non-equilibrium nature of the nanostructures realistically, and provides a wealth of information about the clusters. A recent advance is the simulation of dynamical disorder effects on the catalytic properties. Based on simulations performed at NERSC, we have developed a new model for the understanding of nanoparticle structure dubbed "shake-rattle-and-roll", or SRR [Submitted to J. Chem. Phys.]. SRR helps in understanding the role of disorder on observed structural properties of nanoparticles and their catalytic activity.

We greatly improved the AI2PS workflow, which comprises the calculation of dynamical matrices using ABINIT, the DMDW module for vibrational properties, and various data handling steps. The entire process is now controlled by a single driver and runs from a single input file, thus fully relieving the user from complex data management. Most of the capabilities of the original DMDW module have now been embedded into AI2PS by eliminating the use of Octave scripts. Moreover, format and conventions now closely follow our other ABINIT-based developments, AI2NBSE and OCEAN. Small changes were submitted to the official ABINIT distribution so that AI2PS can now use any modern ABINIT distribution without modification. DMDW was used to compute various vibrational properties, e.g., we computed DW factors that help explain the intensity oscillations in ionization cross sections [J. Chem. Phys. 138 (2013) 234310]. AI2PS is being beta-tested locally.

## Thrust B: AI2NBSE and OCEAN

The AI2NBSE/OCEAN packages for UV and x-ray spectroscopies combine the NIST BSE solver (NBSE) with the DFT electronic structure codes ABINIT or Quantum Espresso to compute excitonic effects and optical properties of periodic systems. OCEAN allows for the calculation of core spectra such as x-ray absorption spectroscopy (XAS), non-resonant inelastic x-ray scattering (NRIXS), and related phenomena. This approach is complementary to Thrust A above, and serves as a benchmark for first principles theoretical calculations of optical and core spectra. Two primary developments were achieved recently. First, the ABINIT and Quantum Espresso versions of OCEAN were merged into a single code with a universal input file. Second, parallelization (by atom) was added to the screening and BSE stages of the calculation, resulting in considerable speed-up for multiple chemical environments. We performed NRIXS calculations on multiple high pressure phases of silicon, and a manuscript containing these results is under review in J. Phys. Chem. We calculated XAS spectra of large water cells to verify structural models from MD simulations [in preparation].

## Thrust C: RT-SIESTA and RTXS

RT-SIESTA is based on a real time extension of SIESTA to compute linear and non-linear response, an approach formally equivalent to TDDFT. RTXS is an XAS extension to RT-SIESTA based on the auto-correlation function approach. RT-SIESTA and RTXS are complementary to Thrusts A and B, and are appropriate for large organo-molecular systems where those methods are either too expensive or insufficiently accurate. These real-time approaches are uniquely suited for simulations of state-of-the-art pump-probe experiments. To improve efficiency and facilitate development, we have recently introduced the RTXS algorithms to GPAW. Recently, the RTXS code was tested for a variety of systems like a 48 carbon diamond cluster and a C60 fullerene, resulting in good agreement with experiment [Phys. Rev. B 86, 115107 (2012)]. Small deviations near the absorption edge were attributed to the core-hole description. To facilitate development of a better core-hole description, we modified RTXS to use GPAW as the TDDFT engine to calculate x-ray spectra. This also resulted in a noticeable increase in multiprocessing scalability and code manageability. Finally, our efforts with RT-SIESTA focused mainly on performance testing using NERSC tools. Those tools allowed us to identify certain communication bottlenecks in the code, which are currently being addressed.

## Code Descriptions

### 3DMx

Molecular dynamics and Monte Carlo simulations of supported metallic clusters using model potentials. This code implements the velocity-Verlet algorithm together with a variety of thermostats, as well as a Metropolis Monte Carlo approach.
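
As an illustration of the velocity-Verlet scheme mentioned above, the following is a minimal one-dimensional sketch in Python. It is not taken from 3DMx: the function name `velocity_verlet`, the harmonic test force, and all parameter values are assumptions chosen only to show the update order (half-step velocity, full-step position, new force, second half-step velocity). No thermostat is included.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in 1D with the velocity-Verlet scheme.

    `force` is a callable returning the force at position x
    (hypothetical interface, for illustration only).
    """
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
    return x, v


# Toy check: harmonic oscillator with k = m = 1, integrated for about one period.
k = 1.0
x, v = velocity_verlet(1.0, 0.0, lambda pos: -k * pos, 1.0, 0.01, 628)
energy = 0.5 * v**2 + 0.5 * k * x**2   # should stay close to the initial 0.5
```

The scheme is time-reversible and conserves energy well over long runs, which is why it is a common default for microcanonical MD before a thermostat is layered on top.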

The code currently runs in serial mode only, but our calculations require the exploration of many initial conditions as well as system states (e.g., temperatures). We therefore expect to run on the order of 200-300 simultaneous serial jobs.

Future development will focus on two main areas: i) Development of new code functionality such as parallel tempering Monte Carlo, to efficiently sample the complex configuration space of metallic clusters. ii) Interfacing of the code with the DMDW/RTDW and FEFF codes, to be able to obtain structurally averaged x-ray spectra as well as efficient dynamical matrices for Debye-Waller factor calculations.
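
For readers unfamiliar with parallel tempering, the sketch below shows the standard replica-exchange acceptance rule, in which two replicas at inverse temperatures β_i and β_j with energies E_i and E_j are swapped with probability min(1, exp[(β_i − β_j)(E_i − E_j)]). This is a generic textbook form, not the planned 3DMx implementation; the function names, the use of eV and K, and the list-based replica storage are all assumptions.

```python
import math
import random

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def swap_probability(e_i, t_i, e_j, t_j):
    """Acceptance probability for exchanging replicas i and j.

    Energies in eV, temperatures in K (assumed units for this sketch).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(configs, energies, temperatures, i, j, rng=random):
    """Exchange the configurations of replicas i and j in place if accepted."""
    p = swap_probability(energies[i], temperatures[i],
                         energies[j], temperatures[j])
    if rng.random() < p:
        configs[i], configs[j] = configs[j], configs[i]
        energies[i], energies[j] = energies[j], energies[i]
        return True
    return False
```

A swap that moves the lower-energy configuration to the colder replica is always accepted; the reverse move is accepted only with a Boltzmann-weighted probability, which preserves detailed balance across the temperature ladder.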

### AI2NBSE/OCEAN

Excited state calculations of optical and core level spectra using extensions of the NIST GW-BSE code and ground state calculations using the plane-wave pseudopotential electronic structure codes ABINIT or QUANTUM ESPRESSO.

Given that AI2NBSE and OCEAN are hybrid codes, their parallel performance varies depending on the stage of the calculation and on the code used for the DFT calculations. Quantum Espresso provides better parallelization overall. The NBSE code is partially parallelized, while several of the interface steps are still serial and can act as bottlenecks. Large portions of this code are now well parallelized, but could possibly be improved through profiling and better management of the large (10-100GB) files generated.

In the future we will pursue three main areas of development: i) Improve the workflow management and installation process, in particular for the Quantum Espresso version of OCEAN. This will likely make the code easier to use by general users. ii) Implement support for ultrasoft pseudopotentials and PAW datasets. This will result in significantly more efficient calculations, and make possible the study of more complex systems. iii) Improve the code for more efficient parallelization and file I/O. Currently a good portion of the run time is spent in serial tasks and file I/O, so there is room for improvement.

### AI2PS

Calculation of phonon contributions to electronic structure and the phonon spectral function. Uses the Eliashberg function calculated perturbatively by the pseudopotential electronic structure code ABINIT.

The computational bottleneck is in the ABINIT runs. Independent runs can be executed in parallel, and each individual calculation scales well if the number of k-points is large. We can usually use up to 128-256 cores depending on the number of k-points in the calculation.

Our main goal for the AI2PS tool is to improve the user interface. This would include automating more input parameters, replacing some settings (e.g., file/directory naming) with defaults that can be overridden, and possibly adapting our other GUI development to AI2PS. In addition we are concurrently working to get it deployed on the cloud with our other cloud tools.

### DFTMD2FEFF

Calculations of configurationally averaged structural properties and XAS spectra with FEFF9MPI based on real-time DFT/MD calculations of dynamic structure using VASP, SIESTA or force-field models. These tools extend the capabilities of FEFF9MPI to investigate structural fluctuations in real-time, e.g., in complex systems such as supported catalysts, nano-structures and materials with unusual properties such as negative thermal expansion (NTE) ceramics.
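
The configurational average described above can be sketched as follows. This is an illustrative outline, not DFTMD2FEFF code: it assumes that a spectrum has already been computed for each selected MD snapshot on a common energy grid, and the helper names `select_snapshots` and `average_spectra`, the equilibration cutoff and the stride are hypothetical.

```python
def select_snapshots(n_steps, n_skip, stride):
    """Indices of MD steps to keep: drop the first n_skip, then take every stride-th."""
    return list(range(n_skip, n_steps, stride))


def average_spectra(spectra):
    """Point-by-point mean of per-snapshot spectra given on one shared energy grid."""
    if not spectra:
        raise ValueError("need at least one spectrum")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must share the same energy grid")
    n = len(spectra)
    return [sum(s[k] for s in spectra) / n for k in range(n_points)]


# Toy usage with made-up numbers: two three-point "spectra".
snapshots = select_snapshots(n_steps=5000, n_skip=1000, stride=100)
toy_spectra = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
mean_mu = average_spectra(toy_spectra)
```

Because each snapshot's spectrum is independent of the others, the per-snapshot calculations can be distributed freely and only the final averaging step needs all results.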

The code is limited by the size of the system studied and the number of time steps. Our most recent simulations have used 100-300 atoms and 5000 time steps. In addition we explore several initial conditions and recently we have performed reaction path simulations, which compute several replicas of a system in parallel. This results in typical runs of order 1000-10000 cpu-hr. The computation of the spectral averages for the sampled snapshots is naturally parallel and can be done efficiently with FEFF9MPI. Our project could benefit from improvements to the scalability of VASP. We usually restrict jobs to 64 processors since VASP doesn't scale well for our systems beyond that. We foresee the computation of both larger systems (for which the scaling is better) and nudged elastic band (NEB) calculations, which are naturally parallel over beads.

In the future we will introduce a new line of development related to our new 3DMx code. As described in its section, 3DMx uses model potentials to accelerate the computation of the molecular dynamics of our systems of interest. We will use trajectories produced with this code in the same manner that we use the DFT/MD trajectories in DFTMD2FEFF. This will enable us to study structural and spectroscopic effects at much longer timescales.

### DMDW/RTDW

Dynamical-matrix Debye-Waller (DMDW) and Real-Time Debye-Waller (RTDW) codes. These codes perform ab initio calculations of phonon spectra and Debye-Waller factors based on: 1) a Lanczos algorithm and auxiliary calculations of dynamical matrices using ABINIT, Quantum Espresso, GAUSSIAN, SIESTA, ORCA and VASP, and 2) the time-autocorrelation function obtained from VASP and SIESTA. The DMDW/RTDW results are needed to obtain ab initio EXAFS Debye-Waller factors for FEFF9MPI, and can be used to obtain other thermal properties like x-ray diffraction DW factors, vibrational free energies, thermal expansion coefficients, etc.
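
To make the connection between a phonon spectrum and an EXAFS Debye-Waller factor concrete, the sketch below evaluates the standard harmonic expression σ²(T) = (ħ/2μ) Σ_k w_k coth(ħω_k / 2k_BT) / ω_k for a projected vibrational density of states represented by a few poles. It is a hedged illustration rather than DMDW code: the pole list format, the function name `sigma2_from_poles`, the assumption that the weights sum to one, and the example frequency and reduced mass are all choices made for this example.

```python
import math

HBAR = 1.054571817e-34     # reduced Planck constant, J s
K_B = 1.380649e-23         # Boltzmann constant, J/K
AMU = 1.66053906660e-27    # atomic mass unit, kg


def sigma2_from_poles(poles, reduced_mass_amu, temperature):
    """Mean-square relative displacement in Angstrom^2.

    poles: list of (frequency_THz, weight) pairs, weights assumed to sum to 1.
    reduced_mass_amu: reduced mass of the absorber-scatterer pair in amu.
    temperature: temperature in K (must be positive).
    """
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    mu = reduced_mass_amu * AMU
    total = 0.0
    for freq_thz, weight in poles:
        omega = 2.0 * math.pi * freq_thz * 1e12            # rad/s
        x = HBAR * omega / (2.0 * K_B * temperature)
        total += weight / (omega * math.tanh(x))           # weight * coth(x) / omega
    return HBAR / (2.0 * mu) * total * 1e20                # m^2 -> Angstrom^2


# Single-pole (Einstein-like) toy case with illustrative values.
s_300 = sigma2_from_poles([(5.0, 1.0)], reduced_mass_amu=31.773, temperature=300.0)
s_10 = sigma2_from_poles([(5.0, 1.0)], reduced_mass_amu=31.773, temperature=10.0)
```

With a single pole the expression reduces to the correlated Einstein model; at low temperature it approaches the zero-point value ħ/(2μω), and it grows with temperature as thermal disorder increases.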

The number of cores used depends on the code used to generate the dynamical matrices or autocorrelation functions. For VASP the limit is usually 64 cores, although this depends on the size of the system. In the future we foresee the study of larger systems which would allow for the use of up to 1024 cores. For ABINIT we can usually use up to 128-256 cores depending on the number of k-points in the calculation. Current developments with Quantum Espresso provide a boost in efficiency over ABINIT. In the case of Gaussian, the core use greatly depends on the methodology. In our experience, ab initio methods usually scale poorly (a few tens of cores), while DFT methods perform better, with routine calculations using 64-128 cores. Performance limitations are associated with the underlying computation of dynamical matrices. We have recently developed interfaces with Quantum Espresso and VASP that greatly increase the performance of the code and the scope of the calculations, allowing us to compute systems with hundreds of atoms per unit cell instead of tens.

We will further improve our interface with VASP and Quantum Espresso. In particular we want to take it to production level for large systems which were previously unattainable. This will take advantage of developments during the last year which simplified the code workflow, making it more usable by general users. In addition, we will link developments in DMDW/RTDW with those in a new code (3DMx, see description) meant for efficient computation of metallic systems (which make up a large portion of our simulations). The interfacing of DMDW with the 3DMx code will allow us to efficiently explore the dynamical matrices of structurally disordered clusters.

### FEFF9MPI

Ab initio, real space Green's function calculations of excited state electronic structure and core-level x-ray and electron energy loss (EELS) spectra. This next generation code takes advantage of high performance computational facilities. Recent developments include corrections for correlated electrons based on the Hubbard model, Compton and resonant inelastic x-ray scattering (RIXS) modules, and a cumulant expansion method for including many-body effects in x-ray photo-electron spectra. In addition, the FEFF code includes parameter free many-body effects (self-energies, core-hole and Debye-Waller factors) and permits calculations of optical constants over a broad spectrum from the visible to hard x-rays using a new real-space spectral function approach for valence level response. This extension also permits calculations of improved self-energies based on calculations of valence excitation spectra, i.e., with a many-pole self-energy. Auxiliary codes (DMDW and DFTMD2FEFF) described separately are used for Debye-Waller factors and real-time simulations of structural fluctuations.

The most time-consuming part of the code parallelizes naturally, with each processor core being assigned to a small number of spectral points. Many FEFF9 codes can be run comfortably on a user's laptop. A cloud port is available for running the most demanding FEFF9MPI calculations.
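
The kind of work division described above can be pictured with a simple round-robin assignment of energy-grid points to processes. The sketch below is only a schematic: the actual distribution used inside FEFF9MPI is not specified here, and the function name `points_for_rank` and the grid and rank counts are assumptions.

```python
def points_for_rank(n_points, n_ranks, rank):
    """Indices of the spectral points handled by one process (round-robin)."""
    if not 0 <= rank < n_ranks:
        raise ValueError("rank must satisfy 0 <= rank < n_ranks")
    return list(range(rank, n_points, n_ranks))


# Toy usage: 100 energy points shared among 8 processes.
assignment = [points_for_rank(100, 8, r) for r in range(8)]
```

Since each spectral point can be evaluated independently, no communication is needed until the results are gathered, and the load differs by at most one point between processes.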

### RTSIESTA/RTXS

RTSIESTA: Calculation of electron and nuclear dynamics and linear and non-linear optical response of molecular systems, based on the real-time time-dependent density functional formalism (RT-TDDFT). RTXS: Calculation of core spectroscopies (XAS and XES) using both Fermi's Golden Rule and efficient correlation function approaches.
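
In real-time approaches of this kind, a spectrum is typically obtained by Fourier transforming a time-domain signal (an induced dipole or a correlation function) with a damping factor that sets the broadening. The following sketch shows that post-processing step on a synthetic signal. It is not RT-SIESTA or RTXS code; the function name `damped_spectrum`, the exponential damping, and the test signal sin(ω₀t) are assumptions chosen so that the result is a Lorentzian-like peak near ω₀.

```python
import cmath
import math


def damped_spectrum(signal, dt, omegas, damping):
    """Imaginary part of the damped Fourier transform of a real time signal.

    signal: samples at t = 0, dt, 2*dt, ...
    omegas: frequencies at which to evaluate the transform.
    damping: exponential damping rate, which sets the line width.
    """
    result = []
    for w in omegas:
        acc = 0j
        for n, value in enumerate(signal):
            t = n * dt
            acc += value * cmath.exp(1j * w * t) * math.exp(-damping * t)
        result.append((acc * dt).imag)
    return result


# Synthetic signal oscillating at omega0 = 2.0 (arbitrary units).
omega0, dt = 2.0, 0.05
signal = [math.sin(omega0 * n * dt) for n in range(2000)]
omegas = [0.05 * k for k in range(81)]
spectrum = damped_spectrum(signal, dt, omegas, damping=0.05)
peak = omegas[spectrum.index(max(spectrum))]   # frequency of the strongest response
```

The total propagation time limits the frequency resolution, and the damping rate should be large enough that the signal has decayed by the end of the run, otherwise truncation introduces spurious oscillations in the spectrum.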

RTSIESTA is currently limited to 4-9 cores due to internode communication. In its present form RTXS scales better (~100 cores), but future developments will likely suffer from the same issues as RTSIESTA. These issues are usually alleviated by the computation of several initial conditions in parallel.

In the case of RT-SIESTA, we will continue working on performance improvements. These improvements will be crucial for the next development which is to include nuclear motion via Ehrenfest dynamics. For RTXS we plan to test the new GPAW version at NERSC. In addition, we will extend our real time methodology to explore core-hole dynamics. This includes exploration of both cumulant and determinantal approaches including intrinsic and extrinsic effects. We would also like to extend GPAW to calculate L-edges, since it is currently limited to K-edges, and to multiple core-holes.

### FEFF-RIXS

### SC2IT and JSC2IT

### MEEP-GUI