
Temporal Probabilistic Concepts from Heterogeneous Data Sequences

  • Conference paper
In: Soft-Ware 2002: Computing in an Imperfect World (Soft-Ware 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2311))


Abstract

We consider the problem of characterisation of sequences of heterogeneous symbolic data that arise from a common underlying temporal pattern. The data, which are subject to imprecision and uncertainty, are heterogeneous with respect to classification schemes, where the class values differ between sequences. However, because the sequences relate to the same underlying concept, the mappings between values, which are not known ab initio, may be learned. Such mappings relate local ontologies, in the form of classification schemes, to a global ontology (the underlying pattern). On the basis of these mappings we use maximum likelihood techniques to handle uncertainty in the data and learn local probabilistic concepts represented by individual temporal instances of the sequences. These local concepts are then combined, thus enabling us to learn the overall temporal probabilistic concept that describes the underlying pattern. Such an approach provides an intuitive way of describing the temporal pattern while allowing us to take account of inherent uncertainty using probabilistic semantics.
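
The maximum likelihood step mentioned above can be illustrated with a small sketch. It is not taken from the paper: it assumes the mappings are already known, that each local class value corresponds to a set of global class values, and that an EM-style iteration is used to estimate the global class probabilities from the resulting imprecise counts. The function name `em_global_distribution` and the example counts are hypothetical.

```python
from typing import Dict, FrozenSet, List


def em_global_distribution(
    counts: Dict[FrozenSet[int], float],
    num_global: int,
    iterations: int = 200,
    tol: float = 1e-10,
) -> List[float]:
    """Estimate global class probabilities from imprecise counts.

    Assumption (not from the source): each key of `counts` is the set of
    global class indices that one local class value maps to, and the value
    is the number of observations recorded under that local class.
    """
    total = sum(counts.values())
    # Start from a uniform distribution over the global classes.
    p = [1.0 / num_global] * num_global
    for _ in range(iterations):
        new_p = [0.0] * num_global
        for subset, n in counts.items():
            mass = sum(p[j] for j in subset)
            if mass == 0.0:
                continue  # no current support for this subset; skip it
            # E-step: share the n observations among the classes in the
            # subset in proportion to the current estimates.
            for j in subset:
                new_p[j] += n * p[j] / mass
        # M-step: normalise the expected counts to probabilities.
        new_p = [v / total for v in new_p]
        change = max(abs(a - b) for a, b in zip(p, new_p))
        p = new_p
        if change < tol:
            break
    return p


# Hypothetical data: three global classes; one local scheme cannot
# distinguish classes 0 and 1, so 20 observations are imprecise.
example_counts = {
    frozenset({0}): 30,
    frozenset({1}): 10,
    frozenset({0, 1}): 20,
    frozenset({2}): 40,
}
estimate = em_global_distribution(example_counts, num_global=3)
```

In this toy case the imprecise observations end up split between classes 0 and 1 in the same 3:1 ratio as the precise ones. A temporal version would, under the same assumptions, repeat the estimate at each time point of the sequences.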




Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

McClean, S., Scotney, B., Palmer, F. (2002). Temporal Probabilistic Concepts from Heterogeneous Data Sequences. In: Bustard, D., Liu, W., Sterritt, R. (eds) Soft-Ware 2002: Computing in an Imperfect World. Soft-Ware 2002. Lecture Notes in Computer Science, vol 2311. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46019-5_15


  • DOI: https://doi.org/10.1007/3-540-46019-5_15


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43481-8

  • Online ISBN: 978-3-540-46019-0

  • eBook Packages: Springer Book Archive
