Abstract
NASA’s Goddard Earth Sciences Distributed Active Archive Center (GES DAAC) processes, stores and distributes Earth science data from a variety of remote sensing satellites. End users of the data range from instrument scientists to global change and climate researchers to federal agencies and foreign governments. Many of these users apply Knowledge Discovery from Databases (KDD) techniques to large volumes of data (on the order of a terabyte) received from the GES DAAC. However, rapid advances in computer power are enabling increases in data processing that are outpacing tape drive performance and network capacity. As a result, the proportion of archived data that can be distributed to users continues to decrease. To mitigate this, we are migrating more knowledge extraction activities (e.g., data mining and data reduction) into the data center, both to reduce the data volume that needs to be distributed and to offer users a more useful and manageable product. This migration faces several technical and human-factor challenges. Because data reduction and mining algorithms are often quite specific to the user’s research needs, the user’s algorithm must be integrated virtually unchanged into the archive environment. In addition, the archive itself is busy with everyday data archive and distribution activities and cannot be dedicated to, or even impacted by, the mining activities. Therefore, we schedule KDD “campaigns”, during which we perform a wholesale retrieval of specific data products and offer users the opportunity to extract information from the data as they are retrieved.
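The campaign idea described above can be illustrated with a minimal sketch, which is not the GES DAAC implementation. It assumes that each data file ("granule") is retrieved once and then handed to every registered user algorithm, so that only the reduced results need to be distributed. All names here (`run_campaign`, `mean_value`, `hot_pixel_count`, the 300.0 threshold, and the sample values) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical stand-ins: a "granule" is a list of numbers representing one data file,
# and a user "algorithm" reduces a granule to a single number.
Granule = List[float]
Algorithm = Callable[[Granule], float]


def run_campaign(
    granules: Iterable[Granule], algorithms: Dict[str, Algorithm]
) -> Dict[str, List[float]]:
    """Read each granule once and apply every registered algorithm to it."""
    results: Dict[str, List[float]] = {name: [] for name in algorithms}
    for granule in granules:  # a single pass over the retrieved data
        for name, algorithm in algorithms.items():
            results[name].append(algorithm(granule))
    return results


# Two illustrative user-supplied reductions (assumptions, not from the chapter).
def mean_value(granule: Granule) -> float:
    return sum(granule) / len(granule)


def hot_pixel_count(granule: Granule) -> float:
    return float(sum(1 for value in granule if value > 300.0))


def retrieve_granules() -> Iterator[Granule]:
    """Simulates a wholesale retrieval; each granule is yielded exactly once."""
    yield [290.0, 310.0, 305.0]
    yield [280.0, 285.0, 295.0]


reduced = run_campaign(
    retrieve_granules(), {"mean": mean_value, "hot": hot_pixel_count}
)
```

In this sketch the user algorithms are plain functions passed in unchanged, which mirrors the requirement that user code be integrated with minimal modification, and the single pass over the data reflects the goal of not adding extra retrieval load to the archive.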
Copyright information
© 2001 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Lynnes, C., Mack, R. (2001). KDD Services at the Goddard Earth Sciences Distributed Active Archive Center. In: Grossman, R.L., Kamath, C., Kegelmeyer, P., Kumar, V., Namburu, R.R. (eds) Data Mining for Scientific and Engineering Applications. Massive Computing, vol 2. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1733-7_10
DOI: https://doi.org/10.1007/978-1-4615-1733-7_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4020-0114-7
Online ISBN: 978-1-4615-1733-7
eBook Packages: Springer Book Archive