Coding ‘Sprinters’ in Hackathon Marathon

Thursday, July 6, 2017 @ 03:07 PM gHale

Each of the teams participating in the hackathon sat at a large round table to facilitate collaboration.
Photo by Brookhaven National Laboratory

Coding “sprinters” took their marks at the U.S. Department of Energy’s (DoE) Brookhaven National Laboratory, starting the first of five days of nonstop programming from early morning until night.

During this coding marathon, or “hackathon,” these teams of computational, theoretical, and domain scientists, software developers, and graduate and postdoctoral students learned how to program their scientific applications on devices for accelerated computing called graphics processing units (GPUs).

Guiding them toward the finish line were GPU programming experts from national labs, universities, and technology companies who donated their time to serve as mentors.

The goal by the end of the week was for the teams new to GPU programming to leave with their applications running on GPUs — or at least with the knowledge of how to do so — and for the teams who had come with their applications already accelerated on GPUs to leave with an optimized version.

GPU Computing Era
GPU-accelerated computing, which is the combined use of GPUs and central processing units (CPUs), is increasingly being used as a way to run applications much faster. Computationally intensive portions of an application are offloaded from the CPU, which consists of a few cores optimized for serial processing (tasks execute one at a time in sequential order), to the GPU, which contains thousands of smaller, more efficient cores optimized for parallel processing (multiple tasks are processed simultaneously).
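Because only part of an application is offloaded, the overall gain depends on how much of the runtime that part represents. A minimal sketch of this relationship, using Amdahl's law, is shown below. The function name and the sample fractions are illustrative assumptions, not figures from the hackathon.

```python
def overall_speedup(offloaded_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only part of the runtime is accelerated.

    offloaded_fraction: share of the original runtime moved to the GPU (0 to 1).
    kernel_speedup: how many times faster that share runs on the GPU.
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    serial_part = 1.0 - offloaded_fraction               # still runs on the CPU
    accelerated_part = offloaded_fraction / kernel_speedup  # runs on the GPU
    return 1.0 / (serial_part + accelerated_part)


if __name__ == "__main__":
    # Hypothetical fractions, each with a 100x faster offloaded portion.
    for fraction in (0.50, 0.90, 0.99):
        print(f"{fraction:.0%} offloaded -> {overall_speedup(fraction, 100.0):.1f}x overall")
```

Under this model, the portion left on the CPU quickly becomes the limit, which is one reason teams focus on the most computationally intensive parts of their codes first.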

While GPUs potentially offer very high memory bandwidth (the rate at which a processor can store data in and read data from memory) and arithmetic performance for a wide range of applications, they are currently difficult to program. One challenge is that developers cannot simply take existing code that runs on a CPU and have it automatically run on a GPU; they need to rewrite or adapt portions of the code. Another challenge is efficiently getting data onto the GPUs in the first place, as data transfer between the CPU and GPU can be slow. Though parallel programming standards such as OpenACC and GPU advances such as hardware and software for managing data transfer make these processes easier, GPU-accelerated computing is still a relatively new concept.
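The data-transfer cost can be made concrete with a rough estimate. The sketch below assumes a simple model in which GPU time is transfer time plus kernel time, and it ignores copying results back to the CPU. The function names and all numbers are hypothetical planning values, not measurements from any team.

```python
def gpu_total_seconds(bytes_moved, link_bytes_per_s, kernel_seconds,
                      iterations=1, keep_data_on_gpu=False):
    """Estimate GPU wall time as transfer time plus kernel time (simplified model)."""
    if link_bytes_per_s <= 0:
        raise ValueError("link_bytes_per_s must be positive")
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    one_transfer = bytes_moved / link_bytes_per_s
    # If the data stays resident on the GPU, it crosses the link only once.
    transfers = 1 if keep_data_on_gpu else iterations
    return transfers * one_transfer + iterations * kernel_seconds


def offload_pays_off(cpu_seconds_per_iteration, gpu_seconds_total, iterations=1):
    """True when the estimated GPU time beats running every iteration on the CPU."""
    return gpu_seconds_total < cpu_seconds_per_iteration * iterations


if __name__ == "__main__":
    # Hypothetical workload: 8 GB of data, a 10 GB/s CPU-GPU link,
    # 0.5 s per iteration on the CPU and 0.1 s per iteration on the GPU.
    naive = gpu_total_seconds(8e9, 10e9, 0.1, iterations=10)
    resident = gpu_total_seconds(8e9, 10e9, 0.1, iterations=10, keep_data_on_gpu=True)
    print("copy every iteration:", naive, "pays off:", offload_pays_off(0.5, naive, 10))
    print("copy once:", resident, "pays off:", offload_pays_off(0.5, resident, 10))
```

With these assumed numbers, a kernel that is five times faster on the GPU still loses to the CPU if the data is copied on every iteration, and wins once the data is kept on the GPU between iterations.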

That is where “Brookathon” comes into play. The event, which started June 5, was hosted by Brookhaven Lab’s Computational Science Initiative (CSI) and jointly organized with DoE’s Oak Ridge National Laboratory, Stony Brook University, and the University of Delaware.

“The architecture of GPUs, which were originally designed to display graphics in video games, is quite different from that of CPUs,” said CSI computational scientist Meifeng Lin, who coordinated Brookathon with the help of an organizing committee and was a member of one of the teams participating in the event. “People are not used to programming GPUs as much as CPUs. The goal of hackathons like Brookathon is to lessen the learning curve, enabling the use of GPUs on next-generation high-performance-computing (HPC) systems for scientific applications.”

Team Lineup
Twenty-two applications were submitted for a spot at Brookathon, half of which came from Brookhaven Lab or nearby Stony Brook University teams. Brookathon received the highest number of applications of any of the hackathons to date, Lin said. Ultimately, a review committee of OpenACC members accepted applications from 10 teams, each of which brought a different application to accelerate on GPUs:
• Team AstroGPU from Stony Brook University: Codes for simulating astrophysical fluid flows
• Team Grid Makers from Brookhaven, Fermilab, Boston University, and the University of Utah (Lin’s team): A multigrid solver for linear equations and a general data-parallel library (called Grid), both related to application development for lattice QCD under DoE’s Exascale Computing Project
• Team HackDpotato from Stony Brook University: A genetic algorithm for protein simulation
• Team Lightning Speed OCT (for optical coherence tomography) from Lehigh University: A program for real-time image processing and three-dimensional image display of biological tissues
• Team MUSIC (for MUScl for Ion Collision) from Brookhaven and Stony Brook University: A code for simulating the evolution of the quark-gluon plasma produced at Brookhaven’s Relativistic Heavy Ion Collider (RHIC) — a DoE Office of Science User Facility
• Team NEK/CEED from DoE’s Argonne National Laboratory, the University of Minnesota, and the University of Illinois Urbana-Champaign: Fluid dynamics and electromagnetic codes (Nek5000 and NekCEM, respectively) for modeling small modular reactors (SMR) and graphene-based surface materials, related to two DoE Exascale Computing Projects: Center for Efficient Exascale Discretizations (CEED) and ExaSMR
• Team Stars from the STAR from Brookhaven, Central China Normal University, and Shanghai Institute of Applied Physics: An online cluster-finding algorithm for the energy-deposition clusters measured at Brookhaven’s Solenoidal Tracker at RHIC (STAR) detector, which searches for signatures of the quark-gluon plasma
• Team The Fastest Trigger of the East from the UK’s Rutherford Appleton Laboratory, Lancaster University, and Queen Mary University of London: Software that reads out data in real time from 40,000 photosensors that collect light generated by neutrino particles, discards the useless majority of the data, and sends the useful bits to be written to disk for future analysis; the software will be used in a particle physics experiment in Japan (Hyper-Kamiokande)
• Team UD-AccSequencer from the University of Delaware: A code for an existing next-generation-sequencing tool for aligning thousands of DNA sequences (BarraCUDA)
• Team Uduh from the University of Delaware and the University of Houston: A code for molecular dynamics simulations, which scientists use to study the interactions between molecules

“The domain scientists, not necessarily computer science programmers, who come together for five days to migrate their scientific codes to GPUs are very excited to be here,” said co-creator, organizer, and University of Delaware professor Sunita Chandrasekaran. “From running into compiler and runtime errors during programming and reaching out to compiler developers for help to participating in daily scrum sessions to provide progress updates, the teams really have a hands-on experience in which they can accomplish a lot in a short amount of time.”

Let the Games Begin

Each team had at least three members and worked on porting their applications to GPUs for the first time or optimizing applications already running on GPUs. As is the case in all of the hackathons, participants did not need to have prior GPU programming experience to attend the event. Two mentors were assigned to each team in the weeks preceding the hackathon to help the participants prepare. In addition to Brookhaven, mentors represented Cornell University; DoE’s Los Alamos, Sandia, and Oak Ridge national laboratories; Mentor Graphics Corporation; NVIDIA Corporation (also the top sponsor of the event); the Swiss National Supercomputing Centre; the University of Delaware; the University of Illinois; and the University of Tennessee, Knoxville.

“You meet GPU experts at conferences but here you sit with them for a whole week as they share their expertise in a hands-on setting,” said Lin. “Because GPU computing is still fairly new to Brookhaven, we did not have a lot of local experts that could serve as mentors. We were fortunate to have Fernanda and Sunita help recruit such a great group of mentors.”

Many of the mentors who volunteered for Brookathon have developed GPU-capable compilers (computer programs that transform source code written in one programming language into instructions that computer processors can understand) and have helped define programming standards for HPC.

Degree of Difficulty
Yet they too can appreciate the difficulty of programming scientific applications on GPUs. As mentor Kyle Friedline, a research assistant in Chandrasekaran’s Computational Research and Programming Lab at the University of Delaware, said, “My team’s code is really tough because of its large size and complex data structures that result in memory allocation problems.”

While most of the teams had prior experience in GPU programming, a few had to start with the basics. Especially for those novice teams, mentorship was key.

“All of our group members were new to GPU programming,” said MUSIC team member Chun Shen, a research associate in Brookhaven’s Nuclear Theory Group. “Our code was originally written in the C++ programming language with a rather complex class structure. We found that it was very hard to port the complex data structures to GPU with OpenACC, and the compiler did not provide us with useful error messages. Only with the support of our direct mentors and through fruitful discussions with other teams’ mentors were we able to simplify our code structure and successfully port our code to GPU within such a short amount of time.”
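The article does not say how the MUSIC team simplified its code. One common pattern for this kind of problem, sketched below as an assumption, is to copy nested objects into one flat, contiguous array per field before handing the data to an accelerator. The class and function names here are hypothetical and are not taken from the MUSIC code.

```python
from array import array


class FluidCell:
    """Hypothetical nested object, standing in for a class-heavy CPU data structure."""

    def __init__(self, energy_density, velocity):
        self.energy_density = energy_density
        self.velocity = velocity  # tuple of (vx, vy, vz)


def flatten_cells(cells):
    """Copy a list of FluidCell objects into one contiguous array of doubles per field."""
    return {
        "energy_density": array("d", [c.energy_density for c in cells]),
        "vx": array("d", [c.velocity[0] for c in cells]),
        "vy": array("d", [c.velocity[1] for c in cells]),
        "vz": array("d", [c.velocity[2] for c in cells]),
    }


if __name__ == "__main__":
    grid = [FluidCell(1.5, (0.1, 0.0, 0.2)), FluidCell(2.5, (0.0, 0.3, 0.1))]
    flat = flatten_cells(grid)
    print(flat["energy_density"].tolist())
    print(flat["vx"].tolist())
```

Flat arrays of plain numbers are generally easier for directive-based tools to copy to a device than objects that hold references to other objects, though the right layout depends on the application.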

At the end of each day, team representatives gave presentations to the entire group so anyone could chime in to offer advice, as many teams shared common challenges. On the last day, the teams gave final presentations describing their accomplishments over the week, lessons learned along the way, and plans going forward.

“The teams worked really hard with their mentors and accomplished a lot in five days,” said Lin. “By the end of the week, all 10 teams had their codes running on GPUs and eight of them achieved code speedups, as much as 150-fold, over the original codes. Even the mentors felt that they learned something, and some already expressed interest in serving again at future hackathons.”
