Understanding the Impact of X86/NT Computing on Microarchitecture

  • Chapter
Workload Characterization of Emerging Computer Applications

Abstract

Many performance evaluation studies in computer architecture rely almost exclusively on simulation of the dynamic instruction stream from a single application. The benchmarks used, such as the SPEC benchmarks, are often CPU intensive and rely very little on the operating system. However, a majority of computer systems are subject to a different class of workloads, for which these common practices may not accurately reflect all performance issues. For example, operating system activity and context switches are ignored because many popular simulators and tracing techniques do not support the additional complexity.

The main goal of this research is to understand the effects of operating system calls and context switches on the microarchitecture in a common computing environment. This work analyzes applications running in the ubiquitous Microsoft Windows environment on an x86 processor. Microarchitectural structures such as the instruction and data caches, TLB, and branch predictor are investigated in detail. The behavior of both application and operating system code is studied to derive a complete picture of the execution behavior of these applications. In addition, a series of desktop and database applications are presented and compared with the SPEC CPU2000 suite. The analysis is conducted using a hardware tracer capable of tracing all activity, including operating system calls and context switches.

We observe that the dynamic instruction stream of desktop and database applications contains 19% to 78% operating system activity, whereas SPEC CPU2000 applications typically involve less than 1%. Not only do desktop and database applications make more operating system calls, but the average number of instructions executed on each entry into the operating system is also higher. Data generated by the operating system and by applications can interfere with each other, resulting in more cache misses, more interference in the branch predictor, and worse TLB performance. We find that simulating application code alone is not ideal for evaluating the performance of microarchitecture enhancements for many programs, especially databases and desktop applications. Simulators and tracers capable of handling all system activity are essential for obtaining meaningful results for typical applications that interact with the operating system and for applications in a multiple-program environment.
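
The interference effect described in the abstract can be illustrated with a small synthetic sketch. This is not the chapter's hardware-tracer methodology: the trace format, the direct-mapped cache geometry (`NUM_SETS`, `LINE_BYTES`), and the helper names `count_misses`, `os_fraction`, and `make_trace` are all assumptions made for illustration. The sketch compares the application's cache misses when its accesses are simulated alone with the misses it sees when kernel accesses that map to the same cache sets are interleaved.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical trace record: (mode, address), where mode is "user" or "kernel".
Access = Tuple[str, int]

# Assumed cache geometry for this toy model (not taken from the chapter).
LINE_BYTES = 32
NUM_SETS = 64


def count_misses(trace: List[Access]) -> Dict[str, int]:
    """Count misses per mode in a direct-mapped cache shared by both modes."""
    tags: List[Optional[int]] = [None] * NUM_SETS
    misses = {"user": 0, "kernel": 0}
    for mode, addr in trace:
        line = addr // LINE_BYTES
        index, tag = line % NUM_SETS, line // NUM_SETS
        if tags[index] != tag:
            misses[mode] += 1      # miss: the set holds a different line
            tags[index] = tag      # fill the set with the new line
    return misses


def os_fraction(trace: List[Access]) -> float:
    """Fraction of trace records that were issued in kernel mode."""
    if not trace:
        return 0.0
    return sum(1 for mode, _ in trace if mode == "kernel") / len(trace)


def make_trace(include_os: bool, iterations: int = 100) -> List[Access]:
    """Synthetic loop: the application re-reads 32 cache lines; optionally a
    kernel routine touches 32 lines that fall in the same cache sets."""
    app_base = 0
    os_base = NUM_SETS * LINE_BYTES * 16  # same set indices, different tags
    trace: List[Access] = []
    for _ in range(iterations):
        trace += [("user", app_base + i * LINE_BYTES) for i in range(32)]
        if include_os:
            trace += [("kernel", os_base + i * LINE_BYTES) for i in range(32)]
    return trace


if __name__ == "__main__":
    app_only = make_trace(include_os=False)
    mixed = make_trace(include_os=True)
    print("OS fraction (mixed trace):", os_fraction(mixed))
    print("user misses, application only:", count_misses(app_only)["user"])
    print("user misses, with kernel activity:", count_misses(mixed)["user"])
```

In this deliberately worst-case construction the application's working set stays resident when simulated alone, but is evicted on every iteration once kernel accesses are interleaved. Real workloads fall somewhere between these extremes, which is why a trace that includes operating system activity is needed to measure the effect.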

Copyright information

© 2001 Springer Science+Business Media New York

About this chapter

Cite this chapter

Bhargava, R., Rubio, J., Kannan, S., John, L.K., Christie, D., Klaes, L. (2001). Understanding the Impact of X86/NT Computing on Microarchitecture. In: John, L.K., Maynard, A.M.G. (eds) Workload Characterization of Emerging Computer Applications. The Springer International Series in Engineering and Computer Science, vol 610. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1613-2_10

  • DOI: https://doi.org/10.1007/978-1-4615-1613-2_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5641-7

  • Online ISBN: 978-1-4615-1613-2

  • eBook Packages: Springer Book Archive
