Abstract
As critical Internet infrastructure, data centers have evolved to include hundreds of thousands of servers in a single facility to support data- and computing-intensive applications. For such large-scale systems, it is a great challenge to design an interconnection network that provides high capacity, low complexity, and low latency. The traditional approach is to build a hierarchical packet network using switches and routers. This approach scales poorly in terms of wiring, control, and latency. We tackle the challenge by designing a novel switch architecture that supports direct interconnection of a huge number of server racks and provides Petabit switching capacity. Our design combines the best features of electronics and optics. Exploiting recent advances in optics, we propose to build a bufferless optical switch fabric that includes interconnected arrayed waveguide grating routers (AWGRs) and tunable wavelength converters (TWCs). The optical fabric is integrated with electronic buffering and control to perform high-speed switching with nanosecond-level reconfiguration overhead. In particular, our architecture reduces the wiring complexity from O(N) to O(sqrt(N)). We design a practical and scalable scheduling algorithm to achieve high throughput under various traffic loads. We also discuss implementation issues to justify the feasibility of this design. Simulation results show that our design achieves good throughput and delay performance.
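To illustrate how an AWGR paired with TWCs can act as a switching element, the following minimal sketch models the cyclic wavelength routing commonly attributed to an idealized N x N AWGR. It is not taken from the chapter: the port-numbering convention (output = input + wavelength index, modulo N) and the function names `awgr_output_port` and `twc_wavelength_for` are assumptions made for illustration only.

```python
# Idealized model of cyclic wavelength routing in an N x N AWGR.
# Assumption (not from the chapter): a signal on wavelength index k entering
# input port i leaves on output port (i + k) mod N. Real devices may use a
# different but equivalent numbering convention.


def awgr_output_port(input_port: int, wavelength_index: int, n_ports: int) -> int:
    """Return the output port reached under the assumed cyclic convention."""
    return (input_port + wavelength_index) % n_ports


def twc_wavelength_for(input_port: int, output_port: int, n_ports: int) -> int:
    """Return the wavelength index a TWC would tune to so that the AWGR
    delivers the signal from input_port to output_port."""
    return (output_port - input_port) % n_ports


# Build the full tuning table for a small hypothetical 4 x 4 AWGR:
# table[i][j] is the wavelength index input i uses to reach output j.
n = 4
table = [[twc_wavelength_for(i, j, n) for j in range(n)] for i in range(n)]

for i, row in enumerate(table):
    print(f"input {i}: wavelength index per output = {row}")
```

Under this model, the inputs that target the same output always use distinct wavelengths, so the passive AWGR itself needs no reconfiguration; only the TWC in front of it is retuned per connection.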
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Xi, K., Kao, YH., Chao, H.J. (2013). A Petabit Bufferless Optical Switch for Data Center Networks. In: Kachris, C., Bergman, K., Tomkos, I. (eds) Optical Interconnects for Future Data Center Networks. Optical Networks. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4630-9_8
DOI: https://doi.org/10.1007/978-1-4614-4630-9_8
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4629-3
Online ISBN: 978-1-4614-4630-9
eBook Packages: Engineering, Engineering (R0)