Framework Resources Multiply Computing Power
Originating Technology/NASA Contribution
For the last 25 years, the NASA Advanced Supercomputing (NAS) Division at Ames Research Center has provided extremely fast supercomputing resources, not only for NASA missions, but for scientific discoveries made outside of NASA as well. The computing environment at NAS includes four powerful high-performance computer systems: Pleiades, Columbia, Schirra, and RTJones. The collective capability of these supercomputers is immense, and in 2010, Pleiades was rated as the sixth most powerful computer in the world, based on a measure of the computer’s rate of execution.
With loads of computing power—plus some to spare—NASA was an early proponent of something called grid computing, which enables the sharing of computing power, databases, and other online tools across geographic locations. The Information Power Grid was NASA’s first distributed, heterogeneous computing infrastructure, linking computers at Ames, Langley Research Center, and Glenn Research Center into a connected grid.
Partnership
To demonstrate a virtual computer environment that links geographically dispersed computer systems over the Internet to help solve large computational problems, 3DGeo Development Inc., of Santa Clara, California, received Small Business Innovation Research (SBIR) funding from Ames in 2002.
3DGeo’s work with Ames started with its Internet Seismic Processing (INSP) product. As a provider of advanced imaging software services for the oil and gas industry, 3DGeo developed INSP to provide an interface between clients and computer servers to manage advanced seismic imaging projects. Used to identify underground deposits of oil and gas, the seismic imaging process first sends sound waves to an area to create an echo that is recorded by an instrument called a geophone. The data from the geophone is then processed to create images of the subsurface.
At the time of the SBIR work, a key existing function of INSP was to facilitate communication among members of an exploration team, providing the tools to easily share information regardless of physical distance. To add to INSP’s capability and to address the need for computational resources beyond those available locally, 3DGeo developed a graphical user interface (GUI) for detecting available remote resources, designing and executing workflows, and continuously monitoring the resources in use. New features were built into INSP to extend its functionality for grid computing, and the technology was named “grid-enabled INSP,” or G-INSP. The conversion of INSP to a grid-enabled system provided access to the resources needed to run advanced imaging applications whenever and wherever they were needed.
In 2008, 3DGeo merged with Fusion Geophysical LLC to become FusionGeo Inc., of The Woodlands, Texas.
Product Outcome
The NASA-derived G-INSP software is commercially available as a product called Accelerated Imaging and Modeling (AIM), which serves as a collaboration tool toward a fully optimized grid environment and provides a framework for integration at different stages of data processing. The software can create a virtual computer environment for any calculation-intensive and data-intensive application and is available as stand-alone software or as a component in other FusionGeo products.
Through its GUI, AIM offers functionality for searching for remote grid resources; viewing and exploring file systems using grid file transfer protocol (FTP); visualizing data; dragging and dropping data sets between grid FTP file systems; creating and executing workflows on the grid; and monitoring workflows and checking intermediate results.
“Our product is an infrastructure for parallel computing, or high-performance computing, over the Internet. If you have a computationally intensive task, which requires thousands of central processing units (CPUs) of computing power, but you don’t have that many CPUs—or don’t have them locally—you can submit your job over the Internet and access computing power, storage, and data that are distributed at centers anywhere in the world,” describes Dimitri Bevc, the chief technology officer at FusionGeo.
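The idea Bevc describes, covering a CPU shortfall with capacity at remote centers, can be illustrated with a minimal sketch. It is not AIM's actual scheduling logic, which the article does not describe; the `Site` class, the `allocate` function, the site names, and the CPU counts are all hypothetical, and the greedy largest-first strategy is just one simple choice.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A hypothetical grid site advertising idle CPUs."""
    name: str
    free_cpus: int


def allocate(required_cpus, sites):
    """Greedily spread a CPU request across sites, largest first.

    Returns a {site name: CPUs taken} dict, or raises RuntimeError if
    the sites together cannot satisfy the request.
    """
    plan = {}
    remaining = required_cpus
    for site in sorted(sites, key=lambda s: s.free_cpus, reverse=True):
        if remaining == 0:
            break
        take = min(site.free_cpus, remaining)
        if take > 0:
            plan[site.name] = take
            remaining -= take
    if remaining > 0:
        raise RuntimeError(f"grid is short by {remaining} CPUs")
    return plan


# Illustrative numbers only: a small local cluster plus two remote centers.
sites = [Site("local", 256), Site("remote_a", 2048), Site("remote_b", 1024)]
plan = allocate(3000, sites)
print(plan)
```

In this example the 256 local CPUs could not cover a 3,000-CPU job, but the two remote sites together can, so the local cluster is not needed at all.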
Because of the industry’s requirements for very large processing and data storage capacities, AIM is particularly well suited to seismic imaging for energy exploration and development. Constructing accurate 3-D images of the subsurface from the data gathered through the seismic imaging process is an extremely resource-intensive task. According to Bevc, it requires the handling of data on the order of 10 to 15 terabytes for a single marine 3-D survey. In addition, processing the data involves thousands of processors using computationally demanding algorithms. Only large-scale parallel computers can run these algorithms and deliver results within a useful time period.
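To give a rough sense of what moving a survey of this size over a network involves, the following back-of-the-envelope sketch estimates transfer time. The 1 Gbit/s sustained link rate is an assumption chosen purely for illustration, not a figure from Bevc or FusionGeo, and the `transfer_hours` function is hypothetical.

```python
def transfer_hours(terabytes, gigabits_per_second):
    """Hours to move a data set over a link at a sustained rate.

    Uses decimal units: 1 TB = 1e12 bytes, 1 Gbit/s = 1e9 bits/s.
    Ignores protocol overhead and contention, so real transfers
    would take longer.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (gigabits_per_second * 1e9)
    return seconds / 3600


# Survey sizes from the article; the link rate is an assumed value.
for size_tb in (10, 15):
    hours = transfer_hours(size_tb, 1.0)
    print(f"{size_tb} TB at 1 Gbit/s: {hours:.1f} hours")
```

Under that assumption, a single survey takes roughly 22 to 33 hours to transfer, before any processing begins.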
Bevc explains that, without grid computing, a large seismic imaging project for the oil and gas exploration industry can take 1 to 2 years to complete. Data tapes are transported physically between the data acquisition site, data banks, data processing sites, and quality control and interpretation sites, usually by the U.S. Postal Service, FedEx, or courier services. Once the data is interpreted, it is often reprocessed with a new set of parameters, and the cycle is repeated. Ultimately, the process culminates with a decision to drill or not to drill a well.
AIM outfits oil and service companies with a grid-enabled virtual computer for larger 3-D seismic imaging projects, and provides computational resources on an as-needed, on-demand basis. “This can significantly shorten the time between receiving seismic data and making a drilling decision, and offers an ideal use of the grid to enhance the real-time value chain for end users in oil and gas exploration. Oil company experts, contractors, and service companies can collaborate between geographically disparate divisions and resources,” says Bevc.
While AIM is currently used by oil companies and seismic service companies, the technology is also applicable to other computation- and data-demanding scientific applications, such as geothermal energy exploration and carbon dioxide sequestration monitoring.