BERKELEY -- A team of scientists from two national laboratories
reached a supercomputing milestone this weekend, getting their simulation of metallic
magnetism to run at 1.02 Teraflops -- more than one trillion calculations per second.
The achievement, reached using a 1,480-processor Cray T3E supercomputer at the
manufacturer's facility in Minnesota, caps an already remarkable scaling up of the
code to run on increasingly powerful massively parallel supercomputers. Over the summer,
the team of scientists at Oak Ridge National Laboratory working with the National Energy
Research Scientific Computing Center (NERSC) at the Lawrence Berkeley National Laboratory
performed a 1,024-atom first-principles simulation of metallic magnetism in iron, which ran
at 657 Gigaflops (billions of calculations per second) on a 1,024-processor Cray/SGI T3E
supercomputer.
This success made the team a finalist for the Gordon Bell Prize for its parallel
computer simulation of metallic magnetism. The team also includes collaborators at the
Pittsburgh Supercomputing Center and the University of Bristol (UK).
Funded as one of the U.S. Department of Energy's Grand Challenges, the group
developed the computer code to provide a better microscopic understanding of metallic
magnetism, which has applications in fields ranging from computer data storage to power
generation and utilization.
Presented at SC98, the annual conference on high-performance computing and networking,
the Gordon Bell Prize recognizes the best accomplishment in high-performance computing.
The Oak Ridge-NERSC group was nominated in the category for highest computer speed using
a real-world application. The winner of this year's prize will be announced during the
conference on Thursday, Nov. 12, in Orlando, Fla.
Although parallel supercomputers are the world's fastest computers -- capable of
performing hundreds of billions of calculations per second -- realizing their potential
often requires writing complex computer codes as well as reformulating the scientific
approach to problems so that the codes scale up efficiently on these types of machines.
In developing this code for parallel computers, the researchers were forced to rethink
their formulation of the basic physical phenomena. The code was originally developed with
Intel Paragon machines at ORNL's Center for Computational Science (CCS) in mind and
has exhibited linear scaling up to 1,024 processors on an Intel Paragon XP/S 150.
"One of the goals of this project is to address critical materials problems on the
microstructural scale to better understand the properties of real materials. A major focus
of our research is to establish the relationship between technical magnetic properties and
microstructure based on fundamental physical principles," said Malcolm Stocks, a
scientist in Oak Ridge's Metals and Ceramics Division and leader of the project.
"The capability to design magnetic materials with specific and well-defined
properties is an essential component of the nation's technological future."
In May and June of this year, the research team ran successively larger calculations on
a series of bigger and more powerful Cray supercomputers. After the simulation code
attained a speed of 276 Gflops on the Cray T3E-900 512-processor supercomputer at NERSC,
the group arranged for use of an even faster T3E-1200 at Cray Research Inc. and achieved
329 Gflops. They were then given dedicated time on a 1,024-processor T3E-600 at the
NASA Goddard Space Flight Center, which allowed them to perform crucial code development
work and testing before the final run at 657 Gflops on a 1,024-processor T3E-1200 at
a U.S. government site.
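To put those figures in perspective: the sustained per-processor rate held roughly steady,
and even improved, as the machines grew from 512 to 1,480 processors, which is what
efficient scaling looks like in practice. The short Python sketch below simply recomputes
that arithmetic from the numbers quoted in this article (the run labels are ours):

    # Per-processor throughput for the runs quoted above.
    # Each entry: (label, aggregate Gflops, processor count).
    runs = [
        ("T3E-900 at NERSC",          276,  512),
        ("T3E-1200 at Cray",          329,  512),
        ("T3E-1200, 657-Gflops run",  657, 1024),
        ("T3E record run",           1020, 1480),  # 1.02 Tflops = 1,020 Gflops
    ]

    for label, gflops, procs in runs:
        # Aggregate speed divided by processor count gives the sustained
        # contribution of each processor, in Gflops.
        print(f"{label:26s} {gflops / procs:.2f} Gflops per processor")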
"These increases in the performance levels demonstrate both the power and the
capabilities of parallel computers -- a code can be scaled up so that it not only runs
faster but allows us to study larger systems and new phenomena that cannot be studied on
smaller machines," said Andrew Canning, a physicist in NERSC's Scientific
Computing Group who worked with the Oak Ridge team on this project.
The Gordon Bell Award work was part of a larger Department of Energy Grand Challenge
Project on Materials, Methods, Microstructure and Magnetism, a collaboration among ORNL
(including its Center for Computational Science and its Computer Science and Mathematics
Divisions), Ames Laboratory (Iowa), Brookhaven National Laboratory, and NERSC.
"As the Department of Energys national facility for computational science,
we see this achievement by the Grand Challenge team as a major breakthrough in
high-performance computing," said NERSC Division Director Horst Simon. "Unlike
other recently published records, this is a real application running on an operational
production machine and delivering real scientific results. NERSC is proud to have been a
partner in this effort."
NERSC provides high-performance computing services to DOE's Energy Research programs
at national laboratories, universities, and industry. Berkeley Lab conducts unclassified
research and is managed by the University of California.
SCIENTIFIC BACKGROUND
Developing a microscopic understanding of metallic magnets has proven to be an abiding
scientific challenge. The difficulty originates in the itinerant nature of the electrons
that give rise to the magnetic moment, which are the same electrons that give rise to
metallic cohesion (bonding). It is this dual behavior of the electrons that precludes
the use of simple (Heisenberg) models.
The performance runs were carried out during the development of a new theory of
non-equilibrium states in magnets. The new constrained local moment (CLM) theory places a
recent proposal for first-principles Spin Dynamics (SD) from a group at Ames Laboratory on
firm theoretical foundations. In SD, non-equilibrium local moments (for example, in magnets
above the Curie temperature, or in the presence of an external field) evolve from one time
step to the next according to a classical equation of motion. As originally formulated,
however, SD had a fundamental problem: the instantaneous magnetization states being evolved
were not properly defined within the Local Spin Density Approximation (LSDA) to Density
Functional Theory, the framework of most modern quantum simulations of materials.
(Interestingly, this year's Nobel Prize in Chemistry was awarded to Professor Walter Kohn
for originating Density Functional Theory.)
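The classical equation of motion referred to above is, schematically, a precession of each
local moment direction about its local effective field. The Python sketch below illustrates
the idea with a generic Landau-Lifshitz-style update for unit moment vectors; the time step,
field model, and function names are illustrative assumptions of ours, not the team's
production code:

    import numpy as np

    def sd_step(moments, fields, dt, gamma=1.0):
        # One generic spin-dynamics step: each unit moment e precesses about
        # its local effective field B, following de/dt = -gamma * (e x B).
        torque = -gamma * np.cross(moments, fields)
        moments = moments + dt * torque          # explicit Euler update
        # Renormalize so every moment remains a unit direction vector.
        return moments / np.linalg.norm(moments, axis=1, keepdims=True)

    # Toy usage: four randomly tilted moments precessing about a field along z.
    rng = np.random.default_rng(0)
    e = rng.normal(size=(4, 3))
    e /= np.linalg.norm(e, axis=1, keepdims=True)
    B = np.tile([0.0, 0.0, 1.0], (4, 1))
    for _ in range(100):
        e = sd_step(e, B, dt=0.01)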
The CLM theory properly formulates SD within constrained density functional theory.
Local constraining fields are introduced whose purpose is to force the local moments to
point in the directions required at a particular time step of SD. A general algorithm for
finding the constraining fields has been developed. The existence of CLM states has been
demonstrated by performing calculations for large (up to 1,024-atom) unit-cell disordered
local moment models of iron above its Curie temperature. In this model the magnetic moments
associated with individual Fe atoms are constrained to point in a set of orientations
chosen using a random number generator. This state can be thought of as prototypical of
the state of magnetic order at a particular step in a finite-temperature SD simulation
of paramagnetic Fe.
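For the disordered local moment calculations described above, the random orientations
should be distributed uniformly over the unit sphere; choosing angles naively (say, uniform
polar angles) would oversample the poles. A standard construction, shown here as a sketch
of what "chosen using a random number generator" might look like in practice, is to
normalize Gaussian random vectors:

    import numpy as np

    def random_unit_vectors(n, seed=None):
        # Draw n directions uniformly distributed over the unit sphere by
        # normalizing 3D Gaussian samples; the isotropy of the Gaussian
        # guarantees a uniform distribution of directions.
        rng = np.random.default_rng(seed)
        v = rng.normal(size=(n, 3))
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    # One random orientation per atom in a 1,024-atom unit cell, mimicking
    # the disordered local moment setup described in the text.
    orientations = random_unit_vectors(1024, seed=42)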
These calculations represent significant progress towards the goal of full
implementation of SD and a first principles theory of the finite temperature and
non-equilibrium properties of magnetic materials.
The work was performed by: Balazs Ujfalussy, Xindong Wang, Xiaoguang Zhang, Donald M.
C. Nicholson, William A. Shelton and G. Malcolm Stocks, Oak Ridge National Laboratory;
Andrew Canning, NERSC, Lawrence Berkeley National Laboratory; Yang Wang, Pittsburgh
Supercomputing Center; and B. L. Gyorffy, H. H. Wills Physics Laboratory, University of
Bristol, UK.