Abstract
Chapter 1 argued that causal modeling allows risk managers to predict the probable consequences of alternative actions, thereby supporting rational (consequence-driven) deliberation and decision-making. This is practical when enough knowledge and data are available to create and validate causal models, using technical methods such as influence diagrams or simulation models, or more black-box statistical methods such as Granger causality testing and intervention analysis. But what should a decision-maker do when not enough is known to construct a reliable causal model? How can risk analysts help improve policy and decision-making when the correct probabilistic causal relation between alternative acts and their probable consequences is unknown? This is the challenge of risk management with model uncertainty. It drives technical debates and policy clashes in problems ranging from preparing for climate change to managing emerging diseases to operating complex and hazardous facilities safely.
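The contrast can be made concrete with a small numerical sketch (the acts, models, and loss figures below are hypothetical illustrations, not results from the chapter). When a single causal model is trusted, the rational act minimizes expected loss under that model; under model uncertainty, robust decision rules such as maximin and minimax regret instead hedge across a set of plausible models:

```python
# Minimal sketch of act selection under model uncertainty.
# Rows: candidate acts; columns: plausible causal models M1, M2, M3.
# Entries are expected losses of each act under each model
# (illustrative numbers only).
import numpy as np

acts = ["act_A", "act_B", "act_C"]  # hypothetical alternatives
losses = np.array([
    [2.0, 9.0, 3.0],   # act_A's expected loss under M1, M2, M3
    [4.0, 4.5, 4.0],   # act_B
    [3.0, 6.0, 8.0],   # act_C
])

# Single best-guess model (here, M1): pick the act minimizing loss under it.
best_guess = acts[np.argmin(losses[:, 0])]

# Maximin (robust): minimize the worst-case loss across all plausible models.
maximin = acts[np.argmin(losses.max(axis=1))]

# Minimax regret: for each model, regret = loss minus the best achievable
# loss under that model; pick the act whose largest regret is smallest.
regret = losses - losses.min(axis=0)
minimax_regret = acts[np.argmin(regret.max(axis=1))]

print(best_guess, maximin, minimax_regret)  # -> act_A act_B act_B
```

In this toy example the best-guess rule and the robust rules recommend different acts, which is precisely the situation that makes model uncertainty consequential for policy.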