Abstract
This paper introduces the DUP System, a simple framework for parallel stream processing. The DUP System enables developers to compose applications from stages written in almost any programming language and to run distributed streaming applications across all POSIX-compatible platforms. Parallel applications written with the DUP System avoid many of the problems found in traditional parallel languages. The DUP System includes a range of simple stages that serve as general-purpose building blocks for larger applications. The paper describes the DUP Assembly language, the DUP architecture, and some of the stages included in the DUP run-time library. It then presents our experiences with parallelizing and distributing the ARB project, a package of tools for RNA/DNA sequence database handling and analysis.
Copyright information
© 2010 IFIP International Federation for Information Processing
About this paper
Cite this paper
Bader, K.C. et al. (2010). Distributed Stream Processing with DUP. In: Ding, C., Shao, Z., Zheng, R. (eds) Network and Parallel Computing. NPC 2010. Lecture Notes in Computer Science, vol 6289. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15672-4_21
DOI: https://doi.org/10.1007/978-3-642-15672-4_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15671-7
Online ISBN: 978-3-642-15672-4
eBook Packages: Computer Science, Computer Science (R0)