Performance and Overhead Measurements

Part of the Undergraduate Topics in Computer Science book series (UTICS)

Abstract

The previous sections used a set of parameters (overheads, number of processors, etc.) to analyze and predict the performance of parallel programs. This section addresses the problem of measuring those and other important parameters for a given shared-memory machine. These parameters are unlikely to be found in the manuals, since the combined behavior of the operating system and the hardware is not usually documented (it depends, for instance, on the underlying compiler). In this section we use a specific multicore parallel machine called MC; however, the proposed methods should work for other machines as well.

Keywords

Coherence Verse 

Copyright information

© Springer-Verlag London 2012

Authors and Affiliations

  1. Department of Computer Science, University of Haifa, Haifa, Israel
