Data Mining in Scientific Data
Knowledge discovery in scientific data, i.e. the extraction of engineering knowledge in the form of a mathematical model description from experimental data, is currently an important part of the industrial re-engineering effort toward improved knowledge reuse. Although large collections of data have been acquired in the past through expensive numerical simulations and experiments, the systematic use of data mining algorithms to extract knowledge from these data is still in its infancy.
In contrast to data sets collected in business and finance, scientific data possess additional properties specific to their domain of origin. First, the principle of cause and effect has a strong impact: it implies the completeness of the parameter list of the unknown functional model more rigorously than one would assume in other domains, such as financial credit-worthiness data or client behavior analyses. Second, scientific data are usually rich in physical unit information, which represents an important piece of structural knowledge in the underlying theory of model formation, in the form of dimensionally homogeneous functions.
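The role of dimensional homogeneity can be made concrete with the standard result of dimensional analysis, Buckingham's Pi theorem, which the abstract does not name explicitly. The notation below is an assumption introduced for illustration: $x_i$ are the $n$ physical quantities of the complete parameter list, $r$ is the rank of their dimensional matrix, and $\alpha_{ji}$ are rational exponents.

```latex
% Assumed notation (not taken from the abstract):
%   x_1, ..., x_n : physical quantities in the complete parameter list
%   r             : rank of the dimensional matrix of the x_i
%   \pi_j         : dimensionless similarity numbers
f(x_1, \dots, x_n) = 0
\quad\Longrightarrow\quad
F(\pi_1, \dots, \pi_m) = 0,
\qquad m = n - r,
\qquad \pi_j = \prod_{i=1}^{n} x_i^{\alpha_{ji}}, \quad j = 1, \dots, m
```

Because $m < n$ whenever $r > 0$, the unknown function $F$ has fewer arguments than $f$.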
Based on these features of scientific data, a similarity transformation using the measurement unit information of the data can be performed. This similarity transformation eliminates the scale dependency of the numerical data values and creates a set of dimensionless similarity numbers. Together with reasoning strategies from artificial intelligence, such as case-based reasoning, these similarity numbers may be used to estimate many engineering properties of the technical object or process under consideration. Furthermore, the similarity transformation usually reduces the complexity of the resulting unknown similarity function, which can then be approximated using different techniques.
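As a rough illustration of such a similarity transformation, the sketch below derives dimensionless groups from unit information by computing the null space of a dimensional matrix. It is a minimal example under stated assumptions, not the authors' implementation: the function names `nullspace` and `pi_value`, the pipe-flow variables (density, velocity, diameter, viscosity), and the numerical values are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import isclose, prod


def nullspace(rows):
    """Return a basis of the null space of a matrix given as a list of rows.

    Uses exact Gauss-Jordan elimination on Fractions, so the exponents of
    the dimensionless groups come out as exact rational numbers.
    """
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0
    for c in range(n_cols):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free_col in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free_col] = Fraction(1)
        for row_idx, pivot_col in enumerate(pivots):
            v[pivot_col] = -m[row_idx][free_col]
        basis.append(v)
    return basis


def pi_value(values, exponents):
    """Evaluate one dimensionless group: the product of values[i] ** exponents[i]."""
    return prod(v ** float(e) for v, e in zip(values, exponents))


# Hypothetical example: columns are density, velocity, diameter, viscosity;
# rows are the exponents of the base units mass, length, time.
DIM_MATRIX = [
    [1, 0, 0, 1],     # mass
    [-3, 1, 1, -1],   # length
    [0, -1, 0, -1],   # time
]

groups = nullspace(DIM_MATRIX)

# The same physical state expressed in two unit systems (illustrative values).
state_si = [1000.0, 2.0, 0.05, 1.0e-3]   # kg/m^3, m/s, m, Pa*s
state_cgs = [1.0, 200.0, 5.0, 1.0e-2]    # g/cm^3, cm/s, cm, poise

if __name__ == "__main__":
    for g in groups:
        print("exponents:", [str(e) for e in g])
        print("SI value: ", pi_value(state_si, g))
        print("CGS value:", pi_value(state_cgs, g))
```

For this example the single basis vector corresponds to the reciprocal of the Reynolds number. Its value is the same in both unit systems, which is the scale independence described above; four dimensional variables collapse into one dimensionless number.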
Keywords: Data Mining, Scientific Data, Similarity Transformation, Data Mining Algorithm, Dimensionless Group