An Epistemological Model for a Data Analysis Process in Support of Verification and Validation

  • Chapter
Information Quality in Information Fusion and Decision Making

Abstract

The verification and validation (V&V) of the data analysis process is critical for establishing the objective correctness of an analytic workflow. Yet the problems, mechanisms, and shortfalls involved in verifying and validating data analysis processes have not been investigated, understood, or well defined by the data analysis community. Verification and validation evaluate the correctness of a logical mechanism, whether computational or cognitive. Verification establishes whether the object of the evaluation performs as it was designed to perform (“Does it do the thing right?”). Validation establishes whether the object of the evaluation performs accurately with respect to the real world (“Does it do the right thing?”). Human analysts use computational mechanisms that produce numerical or statistical results to gain an understanding of the real world from which the data came. These results motivate cognitive associations that further drive the data analysis process. The combination of computational and cognitive analytical methods into a workflow defines the data analysis process. The V&V of the data analysis process as a whole is not typically considered: beyond the computational steps, the cognitive assumptions, reasons, and/or mechanisms that connect analytical elements must also be evaluated for correctness. Data Analysis Process Verification and Validation (DAP-V&V) defines a framework and processes that may be applied to identify, structure, and associate logical elements. DAP-V&V is a way of establishing the correctness of individual steps along an analytical workflow and ensuring the integrity of the conceptual associations that are composed into an aggregate analysis.
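The distinction between verification and validation drawn in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's DAP-V&V framework: the analytic step `moving_average`, the helper names `verify_step` and `validate_step`, the sample numbers, and the tolerance are all hypothetical and chosen only for illustration. Verification compares the step against its design specification; validation compares its output against observations taken from the real world.

```python
from statistics import mean


def moving_average(values, window):
    """Hypothetical analytic step: trailing moving average over `window` points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i - window + 1 : i + 1])
            for i in range(window - 1, len(values))]


def verify_step(step):
    """Verification ("does it do the thing right?"):
    check the step against a hand-computed case from its design spec."""
    return step([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]


def validate_step(step, inputs, window, observed, tolerance):
    """Validation ("does it do the right thing?"):
    check the step's output against real-world observations (assumed given)."""
    predicted = step(inputs, window)
    if len(predicted) != len(observed):
        return False
    return all(abs(p - o) <= tolerance for p, o in zip(predicted, observed))


# Illustrative (made-up) inputs and observations.
is_verified = verify_step(moving_average)
is_valid = validate_step(moving_average, [10, 12, 14, 16], 2,
                         observed=[11.2, 12.8, 15.1], tolerance=0.5)
```

A step can pass verification and still fail validation: tightening `tolerance` to 0.1 in the call above leaves the verification result unchanged but makes the validation check fail, because the step is still built as designed while its output no longer agrees closely enough with the observations.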

Notes

  1. It is important to note that Google Flu Trends is no longer active, having been terminated in 2015 [17].

  2. Attempts over the decades to nail down V&V in the “soft” sciences have resulted in various assertions of types of validity that do little to clarify the space and have contributed greatly to confounding the terminology regarding the study of validation in these spaces [2, 5, 13].

  3. “Key Concepts of VV and A” Sept. 15, 2006; official DOD pp. 7–8; http://vva.msco.mil/Key/key-prd.pdf.

  4. Ibid. p. 6.

  5. Though the purpose of defining an epistemological hierarchy (EH) model of knowledge elements was to evaluate the verification and validation of Human, Social, Cultural, Behavioral (HSCB) models, the mechanism is applicable to any kind of inquiry-based modeling. The prerequisite for an EH decomposition of a model is a Kantian composition of knowledge elements defined as observable concepts and reasoned understanding over those concepts [8].

  6. http://www.businessdictionary.com/definition/data-analysis.html

  7. The R&M data consists of the names of particular LRUs diagnosed as “faulty” and the dates they were removed from the aircraft.

  8. The Mission is defined as specific characteristics of how the aircraft is being flown. In this case, the Mission was induced from the data. Eventually, the Data Analytic Process that produces the Mission will require its own pair of hierarchies in order to be properly characterized, verified, and validated.

  9. Indicators may also be found in the R&M data. For instance, sequences or co-occurrences of LRU removals may be used to predict faults. Analysis for this DAP is not included in this use case.

References

  1. W. Abdullah, R. Reddy, C. Butler, W. Walters, Utilizing Bayesian belief networks to model the ocean-atmosphere interface. J. Miss. Acad. Sci. 63(1), 121–122 (2018)

  2. R. Adcock, D. Collier, Measurement validity: A shared standard for qualitative and quantitative research. Am. Polit. Sci. Rev. 95(03), 529–546 (2001). https://doi.org/10.2307/3118231

  3. A. Bekker, 4 types of Data Analytics to Improve Decision-Making (Science Soft, 2017), [Online]. Available: https://www.scnsoft.com/blog/4-types-of-data-analytics. Accessed 3 Dec 2018

  4. F.C. Copleston, A History of Philosophy (Image Books, Garden City, 1964)

  5. L.J. Cronbach, P.E. Meehl, Construct validity in psychological tests. Psychol. Bull. 52(4), 281–302 (1955)

  6. J.D. Fearon, D.D. Laitin, Ordinary Language and External Validity: Specifying Concepts in the Study of Ethnicity. LiCEP Meetings (2000), Retrieved from https://web.stanford.edu/group/fearon-research/cgi-bin/wordpress/wp-content/uploads/2013/10/Ordinary-Language-and-External-Validity-Specifying-Concepts-in-the-Study-of-Ethnicity.pdf

  7. T. Harford, Big data: A big mistake? Significance 11(5), 14–19 (2014). https://doi.org/10.1111/j.1740-9713.2014.00778.x

  8. I. Kant, Critique of Pure Reason, 1. paperback ed., 15. print (Cambridge Univ. Press, Cambridge [u.a.], 2009)

  9. A. Kaplan, The Conduct of Inquiry (Chandler, San Francisco, 1964)

  10. A. Kaplan, The conduct of inquiry (Transaction Publishers, 1973). Retrieved from https://books.google.com/books?id=ks8wuZHSKs8C&pg=PA53&lpg=PA53&dq=Abraham+Kaplan%27s+paradox&source=bl&ots=bHV9ptpV3g&sig=8_k3iRGHtuBuIOvAcZSGqLwTTYo&hl=en&sa=X&ved=0ahUKEwjzvrDS777YAhVDRN8KHaxlBA4Q6AEISTAI#v=onepage&q=AbrahamKaplan’s paradox&f=fals

  11. C. Kufs, The five pursuits you meet in statistics. (Stats With Cats Blog, 2010), Retrieved May 10, 2018, from https://statswithcats.wordpress.com/2010/08/22/the-five-pursuits-you-meet-in-statistics/

  12. D. Lazer, R. Kennedy, What we can learn from the epic failure of Google Flu Trends. (WIRED, 2015), Retrieved May 10, 2018, from https://www.wired.com/2015/10/can-learn-epic-failure-google-flu-trends/

  13. I.S. Lustick, M.R. Tubin, Verification as a form of validation: Deepening theory to broaden application of DOD protocols to the social sciences, in Proceedings of the 4th International Conference on Applied Human Factors and Ergonomics, (San Francisco, 2012). Retrieved from http://lustickconsulting.com/data/Verification as a Form of Validation - Lustick, Tubin.pdf

  14. J. Overton, Going Pro in Data Science (O’Reilly Media, Inc, 2012). Retrieved from https://www.oreilly.com/data/free/files/going-pro-in-data-science.pdf?mkt_tok=eyJpIjoiWW1GbU1XSmhNRGMwTkRVdyIsInQiOiJGNlRrSFZnZExYXC9wR0ZOZWZOaWZ1ZHFUZjBFM1RhblFJSHM4VmpibW5udVwvY2FLRVVKVFdsQzlCNnV6ZEQ3NkI3VEg3c09idlhZWU5YNEVCTlIySjM0eCtNRGJnQnpsR1Q0QTFaU

  15. J.A. Paulos, Metric Mania (The New York Times, 2010)

  16. A. Ruvinsky, J. Wedgwood, J. Welsh, Establishing bounds of responsible operational use of social science models via innovations in verification and validation, in 2nd International Conference on Cross-Cultural Decision Making, 2012

  17. F. Sailer, Google Flu Trends is dead – long live Google Trends? (UCL Research Department of Primary Care and Population Health Blog, 2018), Retrieved August 23, 2018, from http://blogs.ucl.ac.uk/pcph-blog/2018/01/23/google-flu-trends-is-dead-long-live-google-trends/

  18. J.D. Stemwedel, Basic concepts: Falsifiable claims. – Adventures in ethics and science. Retrieved August 24, 2018, from http://scienceblogs.com/ethicsandscience/2007/01/31/basic-concepts-falsifiable/

  19. A.G. Stephenson, D.R. Mulville, F.H. Bauer, G.A. Dukeman, P. Norvig, L.S. LaPiana, R. Sackheim, Mars Climate Orbiter Mishap Investigation Board Phase I Report (1999), Retrieved from http://sunnyday.mit.edu/accidents/MCO_report.pdf

  20. K. Vasileva, Common mistakes in data analysis – The Data Nudge – Medium. (2017), Retrieved May 10, 2018, from https://medium.com/the-data-nudge/common-mistakes-in-data-analysis-951e366084b9

  21. VV&A Recommended Practices Guide, (2011). https://vva.msco.mil/Key/key-pr.pdf

  22. Wikipedia contributors, Data analysis. (2018), Retrieved October 5, 2018, from https://en.wikipedia.org/w/index.php?title=Data_analysis&oldid=838877371

  23. N. Yau, Why Context is as Important as the Data Itself (2010)

Author information

Corresponding author

Correspondence to Alicia Ruvinsky.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ruvinsky, A. et al. (2019). An Epistemological Model for a Data Analysis Process in Support of Verification and Validation. In: Bossé, É., Rogova, G. (eds) Information Quality in Information Fusion and Decision Making. Information Fusion and Data Science. Springer, Cham. https://doi.org/10.1007/978-3-030-03643-0_16

  • DOI: https://doi.org/10.1007/978-3-030-03643-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03642-3

  • Online ISBN: 978-3-030-03643-0

  • eBook Packages: Computer Science, Computer Science (R0)
