Abstract
Dynamically reconfigurable FPGAs are being deployed for real-time systems that not only adapt to run-time changes in the system load but also reconfigure their resources to minimize the adverse impact of faults. In this paper, we propose a systematic methodology for conducting a Design Space Exploration (DSE) of high availability, mission-critical real-time systems. Given the abundant reconfigurable resources available on the FPGA, the central challenge in this system design problem is to endow each task with the right degree of fault tolerance so as to sustain the most important services throughout the mission’s life.
Our scheme employs a two-stage strategy to tackle faults online. First, a suite of passive online Fault Tolerant (FT) techniques provides immediate mitigation when faults strike in order to sustain the required functionality. Next, a pre-planned fault diagnosis and repair procedure analyzes the faulty modules offline to localize the faults and, if required, utilizes spare resources to recover from hard faults. The repaired modules are then re-engaged to restore the original FT configurations. We employ a Genetic Algorithm (GA) to evolve a population of chromosomes representing a set of FT architectures for the given application. Experiments conducted on large representative task graphs reveal that the DSE system is able to steer the population towards high availability architectural solutions, with a potential tradeoff between area usage and availability.
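To make the GA-based exploration concrete, the following is a minimal sketch, not the authors' implementation. It assumes a chromosome that assigns one FT mode to each task, and every number in it (the mode table, area multipliers, per-module availabilities, criticality weights, and the area budget) is hypothetical. The fitness is a criticality-weighted availability with a hard area constraint, which is one simple way to express the area versus availability tradeoff; the abstract does not specify the actual encoding or objective.

```python
import random

# Hypothetical FT options per task: (name, area multiplier, module availability).
# These values are illustrative assumptions, not figures from the paper.
FT_MODES = [("simplex", 1, 0.90), ("duplex", 2, 0.99), ("tmr", 3, 0.999)]


def area(chrom, base_area):
    """Total area when task i uses the FT mode indexed by chrom[i]."""
    return sum(FT_MODES[g][1] * a for g, a in zip(chrom, base_area))


def fitness(chrom, base_area, weight, budget):
    """Criticality-weighted availability; 0.0 if the area budget is exceeded."""
    if area(chrom, base_area) > budget:
        return 0.0
    return sum(w * FT_MODES[g][2] for g, w in zip(chrom, weight)) / sum(weight)


def evolve(base_area, weight, budget, pop_size=30, generations=40,
           p_mut=0.1, seed=0):
    """Evolve FT assignments for at least two tasks; returns the best chromosome."""
    rng = random.Random(seed)
    n = len(base_area)
    modes = len(FT_MODES)

    def score(c):
        return fitness(c, base_area, weight, budget)

    # Seed the population with the all-simplex design plus random designs.
    pop = [[0] * n]
    pop += [[rng.randrange(modes) for _ in range(n)] for _ in range(pop_size - 1)]

    for _ in range(generations):
        nxt = [max(pop, key=score)[:]]            # elitism: keep the current best
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)   # binary tournament
            p2 = max(rng.sample(pop, 2), key=score)
            cut = rng.randrange(1, n)                 # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Per-gene mutation: replace the FT mode with a random one.
            child = [rng.randrange(modes) if rng.random() < p_mut else g
                     for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)


if __name__ == "__main__":
    base_area = [4, 2, 3, 1]   # hypothetical per-task area units
    weight = [5, 1, 3, 1]      # hypothetical task criticalities
    best = evolve(base_area, weight, budget=20)
    print([FT_MODES[g][0] for g in best], area(best, base_area))
```

Because the best chromosome is carried over unchanged each generation, the best fitness never decreases. A multi-objective formulation that keeps a Pareto front of area and availability would be a natural alternative to the hard budget used here.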
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chakraverty, S., Agarwal, A., Agarwal, A., Kumar, A., Sikri, A. (2012). Design Space Exploration for High Availability drFPGA Based Embedded Systems. In: Hassanien, A.E., Salem, AB.M., Ramadan, R., Kim, Th. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2012. Communications in Computer and Information Science, vol 322. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35326-0_24
DOI: https://doi.org/10.1007/978-3-642-35326-0_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35325-3
Online ISBN: 978-3-642-35326-0
eBook Packages: Computer Science, Computer Science (R0)