
Bayesian Ying Yang Learning (I): A Unified Perspective for Statistical Modeling

Chapter in Intelligent Technologies for Information Analysis

Abstract

Major dependence structure mining tasks are surveyed from a general statistical learning perspective. Bayesian Ying Yang (BYY) harmony learning is introduced as a unified framework for mining these dependence structures, with new mechanisms for model selection and regularization on finite-size samples. The main results are summarized and bibliographic remarks are given. Two typical approaches to implementing learning, namely optimization search and accumulation consensus, are also introduced.
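As a concrete illustration of the automated model selection mechanism mentioned above, the sketch below runs a winner-take-all ("hard-cut") harmony-style learning rule on an over-sized isotropic Gaussian mixture: components that fail to win samples see their mixing weights collapse and are pruned, so the number of components is selected during parameter learning itself. This is a minimal sketch under simplifying assumptions (isotropic components, a fixed learning rate, an ad hoc pruning threshold); names such as harmony_learning are illustrative and this is not the chapter's exact algorithm.

```python
import numpy as np

def gaussian_log_pdf(x, mu, var):
    """Log density of isotropic Gaussians: mu is (k, d), var is (k,)."""
    d = x.shape[-1]
    return -0.5 * (d * np.log(2 * np.pi * var)
                   + np.sum((x - mu) ** 2, axis=-1) / var)

def harmony_learning(X, k_init=8, lr=0.05, epochs=50, prune_tol=1e-2, seed=0):
    """Winner-take-all ("hard-cut") learning on an over-sized Gaussian
    mixture; components whose mixing weight collapses are pruned, so the
    number of components is selected automatically during learning."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k_init, replace=False)].copy()
    var = np.full(k_init, X.var())
    alpha = np.full(k_init, 1.0 / k_init)

    for _ in range(epochs):
        counts = np.zeros(len(mu))
        for x in X[rng.permutation(n)]:
            # Hard-cut posterior: the best-matching component takes the sample.
            j = np.argmax(np.log(alpha) + gaussian_log_pdf(x, mu, var))
            counts[j] += 1
            # Gradient-style update of the winning component only.
            mu[j] += lr * (x - mu[j])
            var[j] += lr * (np.sum((x - mu[j]) ** 2) / d - var[j])
            var[j] = max(var[j], 1e-6)  # keep the variance positive
        # Re-estimate mixing weights and prune collapsed components.
        alpha = counts / n
        keep = alpha > prune_tol
        mu, var, alpha = mu[keep], var[keep], alpha[keep]
        alpha /= alpha.sum()
    return mu, var, alpha

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Three well-separated clusters in 2-D; start with 8 components.
    X = np.vstack([rng.normal(c, 0.3, size=(200, 2)) for c in (-3.0, 0.0, 3.0)])
    mu, var, alpha = harmony_learning(X, k_init=8)
    print(f"{len(alpha)} components survive")  # typically 3
```

Starting from eight components on three well-separated clusters, a run typically ends with three surviving components, reflecting the BYY-style behavior of determining the model scale during learning rather than through a separate post hoc selection criterion.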




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Xu, L. (2004). Bayesian Ying Yang Learning (I): A Unified Perspective for Statistical Modeling. In: Intelligent Technologies for Information Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-07952-2_22


  • DOI: https://doi.org/10.1007/978-3-662-07952-2_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07378-6

  • Online ISBN: 978-3-662-07952-2

  • eBook Packages: Springer Book Archive
