Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks

Kim, Namhoon; Tang, Stephen; Otterness, Nathan; Anderson, James H.; Smith, F. Donelson; Porter, Donald E.

doi:10.1007/s11241-020-09351-2

Published: 29 June 2020

Real-Time Systems, volume 56, pages 349–390 (2020)

Namhoon Kim¹,
Stephen Tang ORCID: orcid.org/0000-0001-9208-9329¹,
Nathan Otterness¹,
James H. Anderson¹,
F. Donelson Smith¹ &
Donald E. Porter¹

Abstract

Efforts towards hosting safety-critical, real-time applications on multicore platforms have been stymied by a problem dubbed the “one-out-of-m” problem: due to excessive analysis pessimism, the overall capacity of an m-core platform can easily be reduced to roughly just one core. The predominant approach for addressing this problem introduces hardware-isolation techniques that ameliorate contention experienced by tasks when accessing shared hardware components, such as DRAM memory or caches. Unfortunately, in work on such techniques, the operating system (OS), which is a key source of potential interference, has been largely ignored. Most real-time OSs do facilitate the use of a coarse-grained partitioning strategy to separate the OS from user-level tasks. However, such a strategy by itself fails to address any data sharing between the OS and tasks, such as when OS services are required for interprocess communication (IPC) or I/O. This paper presents techniques for lessening the impacts of such sharing, specifically in the context of ${\textsf {MC}}^{\textsf {2}}$, a hardware-isolation framework designed for mixed-criticality systems. Additionally, it presents the results from micro-benchmark experiments and a large-scale schedulability study conducted to evaluate the efficacy of the proposed techniques and also to elucidate sharing vs. isolation tradeoffs involving the OS. This is the first paper to systematically consider such tradeoffs and consequent impacts of OS-induced sharing on the one-out-of-m problem.

Notes

We use the terms “processor,” “core,” and “CPU” interchangeably.
Under ${\textsf {MC}}^{\textsf {2}}$, “PET” is used instead of “WCET” because SRT tasks are not provisioned on a worst-case basis.
All cores share two separate connections to the bus arbiter (Freescale 2014, p. 3982), but assigning a separate QoS value to each connection would be largely meaningless because, to our knowledge, there is no way to specify which of the two connections any given memory request will use.
Timing analysis for multicore machines is out of scope for this paper. Our measurement-based approach is sufficient to inform realistic execution-time behavior under different resource-allocation policies, which are the focus of this work.
By this, we mean that interference due to cache evictions does not occur on our platform because the DMA data pages are marked as uncacheable. However, as mentioned in Sect. 2, overhead due to the coherency protocol (i.e., invalidating cache lines) may still exist.
This actually is not the default disk-access behavior in Linux; zero-copy disk I/O requires passing the optional O_DIRECT flag to the open system call.
USB devices may not be common in HRT systems; we use a USB camera only as an exemplar of devices where OS activity may cause memory interference.
This requires knowledge of worst-case interrupt interarrival and execution times. We assume that we operate in a “closed world,” with a priori knowledge of interrupt types and maximum frequencies, as is typically assumed in real-time overhead accounting.
The buddy allocator maintains a list of free blocks per zone. A zone is a group of pages that have similar properties. The hardware platform considered in this paper has only one zone, ZONE_NORMAL. However, other architectures may have multiple zones such as ZONE_DMA, ZONE_NORMAL, and ZONE_HIGHMEM. In such platforms, the buddy allocator can have more than one list (Gorman 2004).
The prior work also considers sharing between tasks of different criticality levels, but we chose not to evaluate such sharing in this paper to reduce the complexity of the schedulability study, which already required the addition of several new parameters. However, cross-criticality sharing remains theoretically possible under our new modifications so long as it remains limited to wait-free communication.
This was done by modifying calls to kmalloc and similar functions to allocate pages from the per-core high-criticality partitions when necessary via the memory-management interface changes described in Sect. 4.
Note that the Load-Generator tasks were not actually executed on the same CPU as Synthetic. Running them all on the same CPU makes obtaining accurate execution-time measurements more difficult, and this experiment was only intended to isolate the impact of DMA-sourced interference on the measured (Synthetic) task.
Briefly (and informally), these categories specify: (1) the fraction of the overall workload that exists at each criticality level, (2) task periods, (3) utilizations at each criticality level, and (4) an LLC reload factor used to determine cache-related preemption delays. For (2) and (3), Level-A, -B, and -C execution costs model inflated worst-case, worst-case, and average-case execution costs, respectively; (4) is modeled based on measurement data. These details are described in full in previously published papers (Chisholm et al. 2015, 2016; Kim et al. 2017a, b).
The “utilization” referred to here is that initially obtained during task-set generation without accounting for ${\textsf {MC}}^{\textsf {2}}$’s hardware management, which improves execution times. Thus, it is possible for a task system to have a total utilization exceeding four and be schedulable.
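
As an illustration of the note above on zero-copy disk I/O, the following minimal Python sketch opens a file with O_DIRECT and reads one page into a page-aligned buffer, which direct I/O generally requires. The helper names read_first_block and _read_block are hypothetical, and the fallback to buffered I/O is an assumption made so that the sketch also runs where O_DIRECT is unavailable or rejected by the filesystem; it is not the setup used in the experiments.

```python
import mmap
import os


def _read_block(path, extra_flags, buf):
    """Open `path` read-only with `extra_flags` and fill `buf` once."""
    fd = os.open(path, os.O_RDONLY | extra_flags)
    # buffering=0 avoids an extra user-space copy layer in Python itself.
    with open(fd, "rb", buffering=0) as f:
        return f.readinto(buf)


def read_first_block(path):
    """Return (data, direct_used) for the first page-sized block of `path`."""
    # os.O_DIRECT is only defined on platforms that support it (e.g. Linux).
    o_direct = getattr(os, "O_DIRECT", 0)
    # An anonymous mmap is page-aligned, satisfying typical O_DIRECT alignment.
    buf = mmap.mmap(-1, mmap.PAGESIZE)
    try:
        direct_used = bool(o_direct)
        try:
            n = _read_block(path, o_direct, buf)
        except OSError:
            # Assumed fallback: some filesystems reject O_DIRECT.
            n = _read_block(path, 0, buf)
            direct_used = False
        return bytes(buf[:n]), direct_used
    finally:
        buf.close()
```

The second element of the returned pair reports whether the direct path was actually taken, which makes it easy to check that a given filesystem honors the flag.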
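
The closed-world assumption in the note above on interrupts can be made concrete with a simple, generic bound. If interrupt source i has minimum interarrival time p_i and worst-case execution time c_i, then at most ⌈t/p_i⌉ of its interrupts are released in any window of length t. The sketch below sums the resulting execution demand over all sources. The function name and the numeric values are hypothetical, and this is one textbook-style bound rather than necessarily the accounting method applied in the paper.

```python
def interrupt_demand(window, sources):
    """Upper bound on execution demand of interrupts released in `window`.

    `sources` is an iterable of (min_interarrival, worst_case_cost) pairs.
    All quantities are integers in the same time unit (e.g. microseconds).
    """
    total = 0
    for period, cost in sources:
        releases = -(-window // period)  # integer ceiling of window / period
        total += releases * cost
    return total


# Hypothetical closed-world interrupt set: (min interarrival, cost) in microseconds.
sources = [(1000, 20), (250, 5)]
demand = interrupt_demand(1000, sources)  # 1 * 20 + 4 * 5 = 40
```

Using integer time units avoids floating-point rounding in the ceiling, which matters when the window is an exact multiple of an interarrival time.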
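
To make the note above on the buddy allocator more concrete, here is a minimal sketch of the classic buddy scheme (Knowlton 1965) for a single zone, with one free set per block order. It is an illustrative model only: the class and method names are hypothetical, block addresses are page indices, and it does not reproduce the Linux implementation or the modifications described in this paper.

```python
class BuddyAllocator:
    """Toy buddy allocator over 2**max_order pages in a single zone."""

    def __init__(self, max_order):
        self.max_order = max_order
        # free_lists[k] holds start pages of free blocks of 2**k pages.
        self.free_lists = [set() for _ in range(max_order + 1)]
        self.free_lists[max_order].add(0)
        self.allocated = {}  # start page -> order

    def alloc(self, order):
        """Allocate a block of 2**order pages; return its start page or None."""
        for k in range(order, self.max_order + 1):
            if self.free_lists[k]:
                start = min(self.free_lists[k])
                self.free_lists[k].remove(start)
                break
        else:
            return None  # no free block is large enough
        # Split repeatedly, returning each unused upper half to its free list.
        while k > order:
            k -= 1
            self.free_lists[k].add(start + (1 << k))
        self.allocated[start] = order
        return start

    def release(self, start):
        """Free a block and coalesce it with its buddy while possible."""
        order = self.allocated.pop(start)
        while order < self.max_order:
            buddy = start ^ (1 << order)  # buddy differs in exactly one bit
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_lists[order].add(start)


# One allocator per zone; the platform in this paper has only ZONE_NORMAL.
zones = {"ZONE_NORMAL": BuddyAllocator(max_order=4)}
```

On a platform with several zones, the dictionary would simply hold one allocator per zone, mirroring the "more than one list" case mentioned in the note.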

References

Alhammad A, Pellizzoni R (2016) Trading cores for memory bandwidth in real-time systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11
Alhammad A, Wasly S, Pellizzoni R (2015) Memory efficient global scheduling of real-time tasks. In: Proceedings of the 21st IEEE real-time and embedded technology and applications symposium, pp 285–296
Altmeyer S, Douma R, Lunniss W, Davis R (2014) Evaluation of cache partitioning for hard real-time systems. In: Proceedings of the 26th Euromicro conference on real-time systems, pp 15–26
ARM Limited (2009) Application note 228: implementing DMA on ARM SMP systems. http://infocenter.arm.com/help/topic/com.arm.doc.dai0228a/DAI228A_DMA_on_SMP_systems.pdf
ARM Limited (2010) AMBA® network interconnect (NIC-301): technical reference manual. https://static.docs.arm.com/ddi0397/g/DDI0397G_amba_network_interconnect_nic301_r2p1_trm.pdf
Audsley N (2013) Memory architecture for NoC-based real-time mixed criticality systems. In: Proceedings of the 1st international workshop on mixed criticality systems
Awan MA, Bletsas K, Souto P, Akesson B, Tovar E (2017) Mixed-criticality scheduling with dynamic redistribution of shared cache. In: Proceedings of the 29th Euromicro conference on real-time systems, pp 18:1–18:21
Brandenburg B (2011) Scheduling and locking in multiprocessor real-time operating systems. PhD thesis, University of North Carolina at Chapel Hill, Chapel Hill, NC
Brandenburg B, Leontyev H, Anderson J (2011) An overview of interrupt accounting techniques for multiprocessor real-time systems. J Syst Arch 57(6):638–654
Burns A, Davis R (2019) Mixed criticality systems—a review. Tech. rep. Department of Computer Science, University of York
Certification Authorities Software Team (2016) Position paper CAST-32A: multi-core processors
Chisholm M, Ward B, Kim N, Anderson J (2015) Cache sharing and isolation tradeoffs in multicore mixed-criticality systems. In: Proceedings of the 36th IEEE international real-time systems symposium, pp 305–316
Chisholm M, Kim N, Ward B, Otterness N, Anderson J, Smith FD (2016) Reconciling the tension between hardware isolation and data sharing in mixed-criticality, multicore systems. In: Proceedings of the 37th IEEE international real-time systems symposium, pp 57–68
Chisholm M, Kim N, Tang S, Otterness N, Anderson J, Smith F, Porter D (2017) Supporting mode changes while providing hardware isolation in mixed-criticality multicore systems. In: Proceedings of the 25th international conference on real-time networks and systems, pp 58–67
Erickson J, Kim N, Anderson J (2015) Recovering from overload in multicore mixed-criticality systems. In: Proceedings of the 29th IEEE international parallel and distributed processing symposium, pp 775–785
Freescale (2014) i.MX 6Dual/6Quad Applications Processor Reference Manual. https://www.nxp.com/webapp/Download?colCode=IMX6DQRM
Giannopoulou G, Stoimenov N, Huang P, Thiele L (2013) Scheduling of mixed-criticality applications on resource-sharing multicore systems. In: Proceedings of the 13th international conference on embedded software, pp 1–15
Gorman M (2004) Describing physical memory. In: Understanding the Linux Virtual Memory Manager. Prentice Hall PTR, Upper Saddle River, NJ, USA, chap 2
Guo D, Pellizzoni R (2017) A requests bundling DRAM controller for mixed-criticality systems. In: Proceedings of the 23rd IEEE real-time and embedded technology and applications symposium, pp 247–258
Hassan M, Patel H (2016) Criticality- and requirement-aware bus arbitration for multi-core mixed criticality systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11
Hassan M, Patel H, Pellizzoni R (2015) A framework for scheduling DRAM memory accesses for multi-core mixed-time critical systems. In: Proceedings of the 21st IEEE real-time and embedded technology and applications symposium, pp 307–316
Herman J, Kenna C, Mollison M, Anderson J, Johnson D (2012) RTOS support for multicore mixed-criticality systems. In: Proceedings of the 18th IEEE real-time and embedded technology and applications symposium, pp 197–208
Herter J, Backes P, Haupenthal F, Reineke J (2011) CAMA: a predictable cache-aware memory allocator. In: Proceedings of the 23rd Euromicro conference on real-time systems, pp 23–32
Huang TY, Liu JWS, Chung JY (1996a) Allowing cycle-stealing direct memory access I/O concurrent with hard-real-time programs. In: Proceedings of the 4th international conference on parallel and distributed systems, pp 422–429
Huang TY, Liu JWS, Hull D (1996b) A method for bounding the effect of DMA I/O interference on program execution time. In: Proceedings of the 17th IEEE real-time systems symposium, pp 275–285
Huang TY, Chou CC, Chen PY (2003) Bounding the execution times of DMA I/O tasks on hard-real-time embedded systems. In: Proceedings of the 9th international conference on real-time and embedded computer systems and applications, pp 499–512
Huang TY, Chou CC, Chen PY (2006) Bounding DMA interference on hard-real-time embedded systems. J Inf Sci Eng 22:1229–1247
Jalle J, Quinones E, Abella J, Fossati L, Zulianello M, Cazorla P (2014) A dual-criticality memory controller (DCmc) proposal and evaluation of a space case study. In: Proceedings of the 35th IEEE real-time systems symposium, pp 207–217
Kim N (2019) Combining hardware management with mixed-criticality provisioning in multicore real-time systems. PhD thesis, University of North Carolina at Chapel Hill, Chapel Hill, NC. http://www.cs.unc.edu/~anderson/diss/namhoondiss.pdf
Kim H, Kandhalu A, Rajkumar R (2013) A coordinated approach for practical OS-level cache management in multi-core real-time systems. In: Proceedings of the 25th Euromicro conference on real-time systems, pp 80–89
Kim H, de Niz D, Andersson B, Klein M, Mutlu O, Rajkumar R (2014a) Bounding memory interference delay in COTS-based multi-core systems. In: Proceedings of the 20th IEEE real-time and embedded technology and applications symposium, pp 145–154
Kim J, Yoon M, Bradford R, Sha L (2014b) Integrated modular avionics (IMA) partition scheduling with conflict-free I/O for multicore avionics systems. In: Proceedings of the 38th IEEE annual computer, software, and applications conference, pp 321–331
Kim H, Broman D, Lee E, Zimmer M, Shrivastava A, Oh J (2015) A predictable and command-level priority-based DRAM controller for mixed-criticality systems. In: Proceedings of the 21st IEEE real-time and embedded technology and applications symposium, pp 317–326
Kim N, Chisholm M, Otterness N, Anderson J, Smith FD (2017a) Allowing shared libraries while supporting hardware isolation in multicore real-time systems. In: Proceedings of the 23rd IEEE real-time and embedded technology and applications symposium, pp 223–234
Kim N, Ward B, Chisholm M, Anderson J, Smith FD (2017b) Attacking the one-out-of-m multicore problem by combining hardware management with mixed-criticality provisioning. Real-Time Syst 53(5):709–759
Kim N, Tang S, Otterness N, Anderson J, Smith FD, Porter D (2018) Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks. In: Proceedings of the 26th international conference on real-time networks and systems, pp 191–201
Kim N, Tang S, Otterness N, Anderson J, Smith F, Porter D (2019) Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks. Full version of this paper. http://www.cs.unc.edu/~anderson/papers.html
Knowlton K (1965) A fast storage allocator. Commun ACM 8(10):623–624
Knuth D (1968) The art of computer programming. Addison-Wesley, Reading
Kotaba O, Nowotsch J, Paulitsch M, Petters S, Theiling H (2013) Multicore in real-time systems—temporal isolation challenges due to shared resources. In: Proceedings of the workshop on industry-driven approaches for cost-effective certification of safety-critical, mixed-criticality systems, pp 1–6
Krishnapillai Y, Wu Z, Pellizzoni R (2014) ROC: a rank-switching, open-row DRAM controller for time-predictable systems. In: Proceedings of the 26th Euromicro conference on real-time systems, pp 27–38
Liao X, Guo R, Jin H, Yue J, Tan G (2017) Enhancing the malloc system with pollution awareness for better cache performance. IEEE Trans Parallel Distrib Syst 28(3):731–745
LITMUS$^{\text{RT}}$ Project (2018) LITMUS$^{\text{RT}}$: Linux testbed for multiprocessor scheduling in real-time systems. http://www.litmus-rt.org/
Liu L, Cui Z, Xing M, Bao Y, Chen M, Wu C (2012) A software memory partition approach for eliminating bank-level interference in multicore systems. In: Proceedings of the 21st international conference on parallel architectures and compilation techniques, pp 367–376
Mollison M, Erickson J, Anderson J, Baruah S, Scoredos J (2010) Mixed criticality real-time scheduling for multicore systems. In: Proceedings of the 7th IEEE international conference on computer and information technology, pp 1864–1871
Muench D, Paulitsch M, Herkersdorf A (2014) Temporal separation for hardware-based I/O virtualization for mixed-criticality embedded real-time systems using PCIe SR-IOV. In: Proceedings of the workshop on architecture of computing systems, pp 1–7
Muralidhara S, Subramanian L, Mutlu O, Kandemir M, Moscibroda T (2011) Reducing memory interference in multicore systems via application-aware memory channel partitioning. In: Proceedings of the 44th annual IEEE/ACM international symposium on microarchitecture, pp 374–385
Musmanno J (2003) Data intensive systems (DIS) benchmark performance summary
Pellizzoni R, Caccamo M (2010) Impact of peripheral-processor interference on WCET analysis of real-time embedded systems. IEEE Trans Comput 59(3):400–415
Pellizzoni R, Bui B, Caccamo M (2008a) Coscheduling of CPU and I/O transactions in COTS-based embedded systems. In: Proceedings of the 29th IEEE real-time systems symposium, pp 221–231
Pellizzoni R, Bui B, Caccamo M, Sha L (2008b) Coscheduling of CPU and I/O transactions in COTS-based embedded systems. In: Proceedings of the 29th IEEE real-time systems symposium, pp 221–231
Pellizzoni R, Schranzhofer A, Chen J, Caccamo M, Thiele L (2010) Worst case delay analysis for memory interference in multicore systems. In: Proceedings of the design, automation test in Europe conference exhibition, pp 741–746
Scolari A, Bartolini DB, Santambrogio MD (2016) A software cache partitioning system for hash-based caches. ACM Trans Arch Code Optim 13(4):1–24
Seetanadi G, Camara J, Almeida L, Arzen K, Maggio M (2017) Event-driven bandwidth allocation with formal guarantees for camera networks. In: Proceedings of the 38th IEEE real-time systems symposium, pp 243–254
Tabish R, Mancuso R, Wasly S, Alhammad A, Phatak S, Pellizzoni R, Caccamo M (2016a) A real-time scratchpad-centric OS for multi-core embedded systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11
Tabish R, Mancuso R, Wasly S, Alhammad A, Phatak S, Pellizzoni R, Caccamo M (2016b) A real-time scratchpad-centric OS for multi-core embedded systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–11
Valsan P, Yun H, Farshchi F (2016) Taming non-blocking caches to improve isolation in multicore real-time systems. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–12
Vestal S (2007) Preemptive scheduling of multi-criticality systems with varying degrees of execution time assurance. In: Proceedings of the 28th IEEE international real-time systems symposium, pp 239–243
Ward B, Herman J, Kenna C, Anderson J (2013) Making shared caches more predictable on multicore platforms. In: Proceedings of the 25th Euromicro conference on real-time systems, pp 157–167
Xu M, Phan LTX, Choi HY, Lee I (2016) Analysis and implementation of global preemptive fixed-priority scheduling with dynamic cache allocation. In: Proceedings of the 22nd IEEE real-time and embedded technology and applications symposium, pp 1–12
Xu M, Phan LTX, Choi HY, Lee I (2017) vCAT: dynamic cache management using CAT virtualization. In: Proceedings of the 23rd IEEE real-time and embedded technology and applications symposium, pp 211–222
Yun H, Yao G, Pellizzoni R, Caccamo M, Sha L (2012) Memory access control in multiprocessor for real-time systems with mixed criticality. In: Proceedings of the 24th Euromicro conference on real-time systems, pp 299–308
Yun H, Mancuso R, Wu Z, Pellizzoni R (2014) PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms. In: Proceedings of the 20th IEEE real-time and embedded technology and applications symposium, pp 155–166

Acknowledgements

Work supported by NSF Grants CNS 1409175, CPS 1446631, CNS 1563845, CNS 1717589, ARO Grant W911NF-17-1-0294, ONR Grant N00014-20-1-2698, and funding from General Motors.

Author information

Authors and Affiliations

Department of Computer Science, University of North Carolina at Chapel Hill, 201 S. Columbia St., Chapel Hill, NC, 27599, USA
Namhoon Kim, Stephen Tang, Nathan Otterness, James H. Anderson, F. Donelson Smith & Donald E. Porter

Corresponding author

Correspondence to Stephen Tang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kim, N., Tang, S., Otterness, N. et al. Supporting I/O and IPC via fine-grained OS isolation for mixed-criticality real-time tasks. Real-Time Syst 56, 349–390 (2020). https://doi.org/10.1007/s11241-020-09351-2

Published: 29 June 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s11241-020-09351-2
