Abstract
New approaches for data provenance and data management (DPDM) are required for mega-science projects such as the Square Kilometer Array, whose extremely large data volumes and intense data rates demand innovative and highly efficient computational paradigms. In this context, we explore a stream-computing approach with an emphasis on the use of accelerators. In particular, we make use of a new generation of high-performance, stream-based parallelization middleware known as InfoSphere Streams. We demonstrate its viability for managing signal processing data pipelines in radio astronomy and for ensuring their interoperability and integrity.
IBM InfoSphere Streams embraces the stream-computing paradigm, a shift from conventional data mining techniques (which analyze existing data held in databases) towards real-time analytic processing. We discuss using InfoSphere Streams for effective DPDM in radio astronomy and propose a way in which it can be utilized for large antenna arrays. We present a case study, the InfoSphere Streams implementation of an autocorrelating spectrometer, and use this example to discuss the advantages of the stream-computing approach and the utilization of hardware accelerators.
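To make the autocorrelating spectrometer idea concrete, the following is a minimal sketch in plain Python. It is not InfoSphere Streams code and not the implementation described in the chapter. It assumes real-valued samples that arrive in chunks, accumulates lag products as each chunk arrives, and then obtains a power spectrum as the cosine transform of the averaged autocorrelation (the Wiener-Khinchin relation). The names `LagAccumulator`, `power_spectrum`, and `demo`, the chunk size, and the test tone are all hypothetical choices made for illustration.

```python
import math
from collections import deque


class LagAccumulator:
    """Keeps running sums of x[n] * x[n-k] for lags k = 0 .. n_lags-1.

    Hypothetical stand-in for a stream operator: state persists between
    chunks, so the full signal never has to be held in memory.
    """

    def __init__(self, n_lags):
        self.history = deque(maxlen=n_lags)  # newest sample at index 0
        self.sums = [0.0] * n_lags
        self.counts = [0] * n_lags

    def push(self, chunk):
        """Consume one chunk of real-valued samples as it arrives."""
        for x in chunk:
            self.history.appendleft(x)
            # history[k] now holds the sample that arrived k steps ago.
            for k, past in enumerate(self.history):
                self.sums[k] += x * past
                self.counts[k] += 1

    def autocorrelation(self):
        """Return the averaged autocorrelation estimate for each lag."""
        return [s / c if c else 0.0 for s, c in zip(self.sums, self.counts)]


def power_spectrum(r, n_channels):
    """Cosine transform of a real, even autocorrelation function.

    Channel j corresponds to j / (2 * n_channels) cycles per sample.
    No window is applied, so sidelobes can be slightly negative.
    """
    spectrum = []
    for j in range(n_channels):
        theta = math.pi * j / n_channels  # 2 * pi * f_j
        s = r[0] + 2.0 * sum(
            r[k] * math.cos(theta * k) for k in range(1, len(r))
        )
        spectrum.append(s)
    return spectrum


def demo():
    # Synthetic tone at 1/8 cycles per sample, delivered in chunks of 64.
    tone = [math.sin(2.0 * math.pi * 0.125 * n) for n in range(1024)]
    acc = LagAccumulator(n_lags=32)
    for start in range(0, len(tone), 64):
        acc.push(tone[start:start + 64])
    return power_spectrum(acc.autocorrelation(), n_channels=32)
```

Calling `demo()` returns 32 spectral channels covering 0 to 0.5 cycles per sample. The tone at 0.125 cycles per sample falls in channel 8, where the spectrum peaks. The per-sample inner loop over lags is the kind of regular, data-parallel work that a hardware accelerator could plausibly take over, though the sketch makes no claim about how the chapter partitions it.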
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mahmoud, M.S., Ensor, A., Biem, A., Elmegreen, B., Gulyaev, S. (2013). Data Provenance and Management in Radio Astronomy: A Stream Computing Approach. In: Liu, Q., Bai, Q., Giugni, S., Williamson, D., Taylor, J. (eds) Data Provenance and Data Management in eScience. Studies in Computational Intelligence, vol 426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29931-5_6
DOI: https://doi.org/10.1007/978-3-642-29931-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29930-8
Online ISBN: 978-3-642-29931-5
eBook Packages: Engineering, Engineering (R0)