Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks

Abstract

Efforts towards hosting safety-critical, real-time applications on multicore platforms have been stymied by a problem dubbed the “one-out-of-m” problem: due to excessive analysis pessimism, the overall capacity of an m-core platform can easily be reduced to roughly just one core. The predominant approach for addressing this problem introduces hardware-isolation techniques that ameliorate contention experienced by tasks when accessing shared hardware components, such as DRAM memory or caches. Unfortunately, in work on such techniques, the operating system (OS), which is a key source of potential interference, has been largely ignored. Most real-time OSs do facilitate the use of a coarse-grained partitioning strategy to separate the OS from user-level tasks. However, such a strategy by itself fails to address any data sharing between the OS and tasks, such as when OS services are required for interprocess communication (IPC) or I/O. This paper presents techniques for lessening the impacts of such sharing, specifically in the context of \({\textsf {MC}}^{\textsf {2}}\), a hardware-isolation framework designed for mixed-criticality systems. Additionally, it presents the results from micro-benchmark experiments and a large-scale schedulability study conducted to evaluate the efficacy of the proposed techniques and also to elucidate sharing vs. isolation tradeoffs involving the OS. This is the first paper to systematically consider such tradeoffs and consequent impacts of OS-induced sharing on the one-out-of-m problem.

Notes

  1. We use the terms “processor,” “core,” and “CPU” interchangeably.

  2. Under \({\textsf {MC}}^{\textsf {2}}\), “PET” is used instead of “WCET” because SRT tasks are not provisioned on a worst-case basis.

  3. All cores share two separate connections to the bus arbiter (Freescale 2014, p. 3982), but assigning a separate QoS value to each connection would be largely meaningless because, to our knowledge, there is no way to specify which of the two connections any given memory request will use.

  4. Timing analysis for multicore machines is out of scope for this paper. Our measurement-based approach is sufficient to inform realistic execution-time behavior under different resource-allocation policies, which are the focus of this work.

  5. By this, we mean that interference due to cache evictions does not occur on our platform because the DMA data pages are marked as uncacheable. However, as mentioned in Sect. 2, overhead due to the coherency protocol (i.e., invalidating cache lines) may still exist.

  6. This actually is not the default disk-access behavior in Linux; zero-copy disk I/O requires passing the optional O_DIRECT flag to the open system call.

  7. USB devices may not be common in HRT systems; we use a USB camera only as an exemplar of devices where OS activity may cause memory interference.

  8. This requires knowledge of worst-case interrupt interarrival and execution times. We assume that we operate in a “closed world,” with a priori knowledge of interrupt types and maximum frequencies, as is typically assumed in real-time overhead accounting.

  9. The buddy allocator maintains a list of free blocks per zone. A zone is a group of pages that have similar properties. The hardware platform considered in this paper has only one zone, ZONE_NORMAL. However, other architectures may have multiple zones such as ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM. In such platforms, the buddy allocator can have more than one list (Gorman 2004).

  10. The prior work also considers sharing between tasks of different criticality levels, but we chose not to evaluate such sharing in this paper to reduce the complexity of the schedulability study, which already required the addition of several new parameters. However, cross-criticality sharing remains theoretically possible under our new modifications so long as it remains limited to wait-free communication.

  11. This was done by modifying calls to kmalloc and similar functions to allocate pages from the per-core high-criticality partitions when necessary via the memory-management interface changes described in Sect. 4.

  12. Note that the Load-Generator tasks were not actually executed on the same CPU as Synthetic. Running them all on the same CPU makes obtaining accurate execution-time measurements more difficult, and this experiment was only intended to isolate the impact of DMA-sourced interference on the measured (Synthetic) task.

  13. Briefly (and informally), these categories specify: (1) the fraction of the overall workload that exists at each criticality level, (2) task periods, (3) utilizations at each criticality level, and (4) an LLC reload factor used to determine cache-related preemption delays. Regarding (2) and (3), Level-A, -B, and -C execution costs model inflated worst-case, worst-case, and average-case execution costs, respectively. Category (4) is modeled based on measurement data. These details are described in full in previously published papers (Chisholm et al. 2015, 2016; Kim et al. 2017a, b).

  14. The “utilization” referred to here is that initially obtained during task-set generation without accounting for \({\textsf {MC}}^{\textsf {2}}\)’s hardware management, which improves execution times. Thus, it is possible for a task system to have a total utilization exceeding four and still be schedulable.
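
Note 6 states that zero-copy disk I/O on Linux requires passing O_DIRECT to the open system call. A minimal sketch of that call pattern is given below, written in Python rather than the kernel-level C code the paper works with. The helper names, the 4096-byte alignment, and the fallback to buffered I/O are assumptions made purely for illustration and are not taken from the paper.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; the real requirement depends on device and file system


def _read_block(path, extra_flags, block):
    """Read up to `block` bytes from the start of `path` into a page-aligned buffer."""
    fd = os.open(path, os.O_RDONLY | extra_flags)
    try:
        # Anonymous mmap regions are page-aligned, which O_DIRECT reads typically need.
        with mmap.mmap(-1, block) as buf:
            n = os.readv(fd, [buf])
            return buf[:n]
    finally:
        os.close(fd)


def read_first_block(path, block=BLOCK):
    """Return (data, used_direct). Hypothetical helper, not from the paper."""
    direct = getattr(os, "O_DIRECT", 0)  # the flag is absent on some platforms
    if direct:
        try:
            return _read_block(path, direct, block), True
        except OSError:
            pass  # e.g. the file system rejects O_DIRECT; fall back to buffered I/O
    return _read_block(path, 0, block), False


if __name__ == "__main__":
    payload = bytes(range(256)) * (BLOCK // 256)
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    try:
        data, used_direct = read_first_block(f.name)
        print(len(data), "bytes read; O_DIRECT used:", used_direct)
    finally:
        os.unlink(f.name)
```

Whether the direct path is actually taken depends on the underlying file system, which is why the sketch reports it instead of assuming it. Without O_DIRECT, the same read would normally go through the kernel's page cache, which is the default behavior the note contrasts against.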

References

  1. Alhammad A, Pellizzoni R (2016) Trading cores for memory bandwidth in real-time systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11

  2. Alhammad A, Wasly S, Pellizzoni R (2015) Memory efficient global scheduling of real-time tasks. In: Proceedings of the 21st IEEE real-time and embedded technology and applications symposium, pp 285–296

  3. Altmeyer S, Douma R, Lunniss W, Davis R (2014) Evaluation of cache partitioning for hard real-time systems. In: Proceedings of the 26th Euromicro conference on real-time systems, pp 15–26

  4. ARM Limited (2009) Application note 228: implementing DMA on ARM SMP systems. http://infocenter.arm.com/help/topic/com.arm.doc.dai0228a/DAI228A_DMA_on_SMP_systems.pdf

  5. ARM Limited (2010) AMBA® network interconnect (NIC-301): technical reference manual. https://static.docs.arm.com/ddi0397/g/DDI0397G_amba_network_interconnect_nic301_r2p1_trm.pdf

  6. Audsley N (2013) Memory architecture for NoC-based real-time mixed criticality systems. In: Proceedings of the 1st international workshop on mixed criticality systems

  7. Awan MA, Bletsas K, Souto P, Akesson B, Tovar E (2017) Mixed-criticality scheduling with dynamic redistribution of shared cache. In: Proceedings of the 29th Euromicro conference on real-time systems, pp 18:1–18:21

  8. Brandenburg B (2011) Scheduling and locking in multiprocessor real-time operating systems. PhD thesis, University of North Carolina at Chapel Hill, Chapel Hill, NC

  9. Brandenburg B, Leontyev H, Anderson J (2011) An overview of interrupt accounting techniques for multiprocessor real-time systems. J Syst Arch 57(6):638–654

  10. Burns A, Davis R (2019) Mixed criticality systems—a review. Tech. rep. Department of Computer Science, University of York

  11. Certification Authorities Software Team (2016) Position paper CAST-32A: multi-core processors

  12. Chisholm M, Ward B, Kim N, Anderson J (2015) Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In: Proceedings of the 36th IEEE international real-time systems symposium, pp 305–316

  13. Chisholm M, Kim N, Ward B, Otterness N, Anderson J, Smith FD (2016) Reconciling the tension between hardware isolation and data sharing in mixed-criticality, multicore systems. In: Proceedings of the 37th IEEE international real-time systems symposium, pp 57–68

  14. Chisholm M, Kim N, Tang S, Otterness N, Anderson J, Smith F, Porter D (2017) Supporting mode changes while providing hardware isolation in mixed-criticality multicore systems. In: Proceedings of the 25th international conference on real-time networks and systems, pp 58–67

  15. Erickson J, Kim N, Anderson J (2015) Recovering from overload in multicore mixed-criticality systems. In: Proceedings of the 29th IEEE international parallel and distributed processing symposium, pp 775–785

  16. Freescale (2014) i.MX 6Dual/6Quad Applications Processor Reference Manual. https://www.nxp.com/webapp/Download?colCode=IMX6DQRM

  17. Giannopoulou G, Stoimenov N, Huang P, Thiele L (2013) Scheduling of mixed-criticality applications on resource-sharing multicore systems. In: Proceedings of the 13th international conference on embedded software, pp 1–15

  18. Gorman M (2004) Describing physical memory. In: Understanding the Linux Virtual Memory Manager. Prentice Hall PTR, Upper Saddle River, NJ, USA, chap 2

  19. Guo D, Pellizzoni R (2017) A requests bundling DRAM controller for mixed-criticality systems. In: Proceedings of the 23rd IEEE real-time and embedded technology and applications symposium, pp 247–258

  20. Hassan M, Patel H (2016) Criticality- and requirement-aware bus arbitration for multi-core mixed criticality systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11

  21. Hassan M, Patel H, Pellizzoni R (2015) A framework for scheduling DRAM memory accesses for multi-core mixed-time critical systems. In: Proceedings of the 21st IEEE real-time and embedded technology and applications symposium, pp 307–316

  22. Herman J, Kenna C, Mollison M, Anderson J, Johnson D (2012) RTOS support for multicore mixed-criticality systems. In: Proceedings of the 18th IEEE real-time and embedded technology and applications symposium, pp 197–208

  23. Herter J, Backes P, Haupenthal F, Reineke J (2011) CAMA: a predictable cache-aware memory allocator. In: Proceedings of the 23rd Euromicro conference on real-time systems, pp 23–32

  24. Huang TY, Liu JWS, Chung JY (1996a) Allowing cycle-stealing direct memory access I/O concurrent with hard-real-time programs. In: Proceedings of the 4th international conference on parallel and distributed systems, pp 422–429

  25. Huang TY, Liu JWS, Hull D (1996b) A method for bounding the effect of DMA I/O interference on program execution time. In: Proceedings of the 17th IEEE real-time systems symposium, pp 275–285

  26. Huang TY, Chou CC, Chen PY (2003) Bounding the execution times of DMA I/O tasks on hard-real-time embedded systems. In: Proceedings of the 9th international conference on real-time and embedded computer systems and applications, pp 499–512

  27. Huang TY, Chou CC, Chen PY (2006) Bounding DMA interference on hard-real-time embedded systems. J Inf Sci Eng 22:1229–1247

  28. Jalle J, Quinones E, Abella J, Fossati L, Zulianello M, Cazorla P (2014) A dual-criticality memory controller (DCmc) proposal and evaluation of a space case study. In: Proceedings of the 35th IEEE real-time systems symposium, pp 207–217

  29. Kim N (2019) Combining hardware management with mixed-criticality provisioning in multicore real-time systems. PhD thesis, University of North Carolina at Chapel Hill, Chapel Hill, NC. http://www.cs.unc.edu/~anderson/diss/namhoondiss.pdf

  30. Kim H, Kandhalu A, Rajkumar R (2013) A coordinated approach for practical OS-level cache management in multi-core real-time systems. In: Proceedings of the 25th Euromicro conference on real-time systems, pp 80–89

  31. Kim H, de Niz D, Andersson B, Klein M, Mutlu O, Rajkumar R (2014a) Bounding memory interference delay in COTS-based multi-core systems. In: Proceedings of the 20th IEEE real-time and embedded technology and applications symposium, pp 145–154

  32. Kim J, Yoon M, Bradford R, Sha L (2014b) Integrated modular avionics (IMA) partition scheduling with conflict-free I/O for multicore avionics systems. In: Proceedings of the 38th IEEE annual computer, software, and applications conference, pp 321–331

  33. Kim H, Broman D, Lee E, Zimmer M, Shrivastava A, Oh J (2015) A predictable and command-level priority-based DRAM controller for mixed-criticality systems. In: Proceedings of the 21st IEEE real-time and embedded technology and applications symposium, pp 317–326

  34. Kim N, Chisholm M, Otterness N, Anderson J, Smith FD (2017a) Allowing shared libraries while supporting hardware isolation in multicore real-time systems. In: Proceedings of the 23rd IEEE real-time and embedded technology and applications symposium, pp 223–234

  35. Kim N, Ward B, Chisholm M, Anderson J, Smith FD (2017b) Attacking the one-out-of-m multicore problem by combining hardware management with mixed-criticality provisioning. Real-Time Syst 53(5):709–759

  36. Kim N, Tang S, Otterness N, Anderson J, Smith FD, Porter D (2018) Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks. In: Proceedings of the 26th international conference on real-time networks and systems, pp 191–201

  37. Kim N, Tang S, Otterness N, Anderson J, Smith F, Porter D (2019) Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks. Full version of this paper. http://www.cs.unc.edu/~anderson/papers.html

  38. Knowlton K (1965) A fast storage allocator. Commun ACM 8(10):623–624

  39. Knuth D (1968) The art of computer programming. Addison-Wesley, Reading

  40. Kotaba O, Nowotsch J, Paulitsch M, Petters S, Theiling H (2013) Multicore in real-time systems—temporal isolation challenges due to shared resources. In: Proceedings of the workshop on industry-driven approaches for cost-effective certification of safety-critical, mixed-criticality systems, pp 1–6

  41. Krishnapillai Y, Wu Z, Pellizzoni R (2014) ROC: a rank-switching, open-row DRAM controller for time-predictable systems. In: Proceedings of the 26th Euromicro conference on real-time systems, pp 27–38

  42. Liao X, Guo R, Jin H, Yue J, Tan G (2017) Enhancing the malloc system with pollution awareness for better cache performance. IEEE Trans Parallel Distrib Syst 28(3):731–745

  43. LITMUS\(^{\text{RT}}\) Project (2018) LITMUS\(^{\text{RT}}\): Linux testbed for multiprocessor scheduling in real-time systems. http://www.litmus-rt.org/

  44. Liu L, Cui Z, Xing M, Bao Y, Chen M, Wu C (2012) A software memory partition approach for eliminating bank-level interference in multicore systems. In: Proceedings of the 21st international conference on parallel architectures and compilation techniques, pp 367–376

  45. Mollison M, Erickson J, Anderson J, Baruah S, Scoredos J (2010) Mixed criticality real-time scheduling for multicore systems. In: Proceedings of the 7th IEEE international conference on computer and information technology, pp 1864–1871

  46. Muench D, Paulitsch M, Herkersdorf A (2014) Temporal separation for hardware-based I/O virtualization for mixed-criticality embedded real-time systems using PCIe SR-IOV. In: Proceedings of the workshop on architecture of computing systems, pp 1–7

  47. Muralidhara S, Subramanian L, Mutlu O, Kandemir M, Moscibroda T (2011) Reducing memory interference in multicore systems via application-aware memory channel partitioning. In: Proceedings of the 44th annual IEEE/ACM international symposium on microarchitecture, pp 374–385

  48. Musmanno J (2003) Data intensive systems (DIS) benchmark performance summary

  49. Pellizzoni R, Caccamo M (2010) Impact of peripheral-processor interference on WCET analysis of real-time embedded systems. IEEE Trans Comput 59(3):400–415

  50. Pellizzoni R, Bui B, Caccamo M (2008a) Coscheduling of CPU and I/O transactions in COTS-based embedded systems. In: Proceedings of the 29th IEEE real-time systems symposium, pp 221–231

  51. Pellizzoni R, Bui B, Caccamo M, Sha L (2008b) Coscheduling of CPU and I/O transactions in COTS-based embedded systems. In: Proceedings of the 29th IEEE real-time systems symposium, pp 221–231

  52. Pellizzoni R, Schranzhofer A, Chen J, Caccamo M, Thiele L (2010) Worst case delay analysis for memory interference in multicore systems. In: Proceedings of the design, automation test in Europe conference exhibition, pp 741–746

  53. Scolari A, Bartolini DB, Santambrogio MD (2016) A software cache partitioning system for hash-based caches. ACM Trans Arch Code Optim 13(4):1–24

  54. Seetanadi G, Camara J, Almeida L, Arzen K, Maggio M (2017) Event-driven bandwidth allocation with formal guarantees for camera networks. In: Proceedings of the 38th IEEE real-time systems symposium, pp 243–254

  55. Tabish R, Mancuso R, Wasly S, Alhammad A, Phatak S, Pellizzoni R, Caccamo M (2016a) A real-time scratchpad-centric OS for multi-core embedded systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11

  56. Tabish R, Mancuso R, Wasly S, Alhammad A, Phatak S, Pellizzoni R, Caccamo M (2016b) A real-time scratchpad-centric OS for multi-core embedded systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11

  57. Valsan P, Yun H, Farshchi F (2016) Taming non-blocking caches to improve isolation in multicore real-time systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–12

  58. Vestal S (2007) Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. In: Proceedings of the 28th IEEE international real-time systems symposium, pp 239–243

  59. Ward B, Herman J, Kenna C, Anderson J (2013) Making shared caches more predictable on multicore platforms. In: Proceedings of the 25th Euromicro conference on real-time systems, pp 157–167

  60. Xu M, Phan LTX, Choi HY, Lee I (2016) Analysis and implementation of global preemptive fixed-priority scheduling with dynamic cache allocation. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–12

  61. Xu M, Phan LTX, Choi HY, Lee I (2017) vCAT: dynamic cache management using CAT virtualization. In: Proceedings of the 23rd IEEE real-time and embedded technology and applications symposium, pp 211–222

  62. Yun H, Yao G, Pellizzoni R, Caccamo M, Sha L (2012) Memory access control in multiprocessor for real-time systems with mixed criticality. In: Proceedings of the 24th Euromicro conference on real-time systems, pp 299–308

  63. Yun H, Mancuso R, Wu Z, Pellizzoni R (2014) PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In: Proceedings of the 20th IEEE real-time and embedded technology and applications symposium, pp 155–166

Acknowledgements

Work supported by NSF Grants CNS 1409175, CPS 1446631, CNS 1563845, CNS 1717589, ARO Grant W911NF-17-1-0294, ONR Grant N00014-20-1-2698, and funding from General Motors.

Author information

Corresponding author

Correspondence to Stephen Tang.

About this article

Cite this article

Kim, N., Tang, S., Otterness, N. et al. Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks. Real-Time Syst (2020). https://doi.org/10.1007/s11241-020-09351-2

Keywords

  • Real-time
  • Mixed-criticality
  • Hardware management
  • Multi-core systems
  • I/O
  • Interprocess communication