LLM: A Low Latency Messaging Infrastructure for Linux Clusters
In this paper, we develop LLM, a robust and efficient low latency message passing infrastructure for kernel-to-kernel communication. The main goal is to overcome the high latencies associated with the management of the conventional TCP/IP protocol stack. LLM provides a transport protocol that offers reliability at the fragment level while keeping acknowledgment overhead low, which is justified by the high reliability of the LAN. The system uses architectural facilities that the Linux kernel provides specifically for optimization in the respective areas. Reliability against fragment losses is ensured by a low overhead negative acknowledgment scheme. The implementation takes the form of loadable modules that extend the Linux OS. In a typical implementation on a cluster of two nodes, each a uniprocessor 400 MHz Intel Pentium on a 10/100 Mbps LAN, LLM achieved an average round-trip latency of 0.169 ms, compared with the 0.531 ms obtained by the ICMP (ping) protocol. A comparison of LLM with other approaches is also provided.
Keywords: Cluster Node, Memory Allocation, Original Packet, Negative Acknowledgment, Object Cache