Pareto Density Estimation: A Density Estimation for Knowledge Discovery

  • Conference paper

Abstract

Pareto Density Estimation (PDE), as defined in this work, is a method for estimating probability density functions using hyperspheres. The radius of the hyperspheres is derived by optimizing information while minimizing set size. PDE is shown to be a very good estimate for data containing clusters of Gaussian structure. The behavior of the method is demonstrated with respect to cluster overlap, number of clusters, different variances in different clusters, and application to high-dimensional data. For high-dimensional data, PDE is found to be appropriate for the purpose of cluster analysis. The method is tested successfully on a difficult high-dimensional real-world problem: stock picking in falling markets.
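The hypersphere idea in the abstract can be illustrated with a minimal sketch: the density at each data point is approximated by the number of points that fall inside a hypersphere of fixed radius centered on it. The abstract does not give the formula for the radius, so the helper `radius_from_quantile` below and its `fraction` parameter are assumptions made for illustration only (a low quantile of the pairwise distances), not the paper's derivation. The function names are hypothetical as well.

```python
import numpy as np


def pairwise_distances(data):
    """Euclidean distance between every pair of rows in `data` (n x d)."""
    diffs = data[:, None, :] - data[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))


def hypersphere_counts(data, radius):
    """For each point, count the points within `radius` (the point itself included).

    The raw count serves as an unnormalized density estimate.
    """
    return (pairwise_distances(data) <= radius).sum(axis=1)


def radius_from_quantile(data, fraction=0.2):
    """ASSUMPTION: pick the radius as a low quantile of all pairwise distances.

    The abstract only says the radius follows from an information
    optimization; `fraction` is a stand-in for that result.
    """
    dists = pairwise_distances(data)
    upper = np.triu_indices(len(data), k=1)  # each pair counted once
    return float(np.quantile(dists[upper], fraction))


# Three points close together and one isolated point.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
counts = hypersphere_counts(points, radius=0.5)
```

In this toy data the three nearby points each see a higher count than the isolated point, which is the behavior a density estimate for cluster analysis needs. The dense distance matrix costs O(n²) memory, so a neighbor-search structure would be the better choice for large data sets.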

Copyright information

© 2005 Springer-Verlag Berlin · Heidelberg

About this paper

Cite this paper

Ultsch, A. (2005). Pareto Density Estimation: A Density Estimation for Knowledge Discovery. In: Baier, D., Wernecke, KD. (eds) Innovations in Classification, Data Science, and Information Systems. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-26981-9_12
