DOI: 10.1177/1094342006067471
Supporting Scalable and Distributed Data Subsetting and Aggregation in Large-Scale Seismic Data Analysis
DEPARTMENT OF BIOMEDICAL INFORMATICS, THE OHIO STATE UNIVERSITY
DEPARTMENT OF BIOMEDICAL INFORMATICS, THE OHIO STATE UNIVERSITY UMIT{at}BMI.OSU.EDU
DEPARTMENT OF BIOMEDICAL INFORMATICS, THE OHIO STATE UNIVERSITY
INSTITUTE FOR GEOPHYSICS, UNIVERSITY OF TEXAS AT AUSTIN
DEPARTMENT OF BIOMEDICAL INFORMATICS, THE OHIO STATE UNIVERSITY

The ability to query and process very large, terabyte-scale datasets has become a key step in many scientific and engineering applications. In this paper, we describe how two middleware frameworks can be applied in an integrated fashion to provide a scalable and efficient system for executing seismic data analysis on large datasets in a distributed environment. We investigate different strategies for efficient querying of large datasets, as well as parallel implementations of a seismic image reconstruction algorithm. Our results on a state-of-the-art mass storage system coupled with a high-end compute cluster show that our implementation is scalable and can achieve a data processing rate of about 2.9 GB/s, which is about 70% of the maximum 4.2 GB/s application-level raw I/O bandwidth of the storage platform.
Key Words: Seismic Data Analysis; Data-Driven Applications