
Nonlinear Estimators and Tail Bounds for Dimension Reduction in ℓ1 Using Cauchy Random Projections

  • Conference paper
Learning Theory (COLT 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4539)

Abstract

For dimension reduction in ℓ1, one can multiply a data matrix A ∈ ℝn×D by R ∈ ℝD×k (k ≪ D) whose entries are i.i.d. samples of the standard Cauchy distribution. A known impossibility result shows that the pairwise ℓ1 distances in A cannot be recovered from B = AR ∈ ℝn×k using linear estimators. However, nonlinear estimators remain useful for certain applications in data stream computations, information retrieval, learning, and data mining.

We propose three types of nonlinear estimators: the bias-corrected sample median estimator, the bias-corrected geometric mean estimator, and the bias-corrected maximum likelihood estimator. We derive tail bounds for the geometric mean estimator and establish that \(k = O\left(\frac{\log n}{\epsilon^2}\right)\) suffices, with the constants given explicitly. Asymptotically (as k → ∞), both the sample median estimator and the geometric mean estimator are about 80% efficient compared to the maximum likelihood estimator (MLE). We analyze the moments of the MLE and propose approximating the distribution of the MLE by an inverse Gaussian.
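The projection scheme and two of the estimators above can be sketched numerically. By the 1-stability of the Cauchy distribution, each projected coordinate of the difference of two data vectors is distributed as Cauchy(0, d), where d is their ℓ1 distance; the median of |Cauchy(0, d)| equals d, and E[|Cauchy(0, d)|^(1/k)] = d^(1/k)/cos(π/(2k)), which motivates the two estimators below. This is an illustrative sketch (dimensions, seed, and variable names are our own choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(42)
D, k = 10_000, 500  # original dimension D, projected dimension k << D

u = rng.standard_normal(D)
v = rng.standard_normal(D)
d_true = np.sum(np.abs(u - v))  # the l1 distance to be estimated

# Projection matrix with i.i.d. standard Cauchy entries.
R = rng.standard_cauchy(size=(D, k))

# By 1-stability, each entry of b is distributed as Cauchy(0, d_true).
b = (u - v) @ R

# Sample median estimator: the median of |Cauchy(0, d)| is exactly d.
d_median = np.median(np.abs(b))

# Bias-corrected geometric mean estimator: multiplying the geometric
# mean of |b_j| by cos(pi/(2k))^k cancels the bias in E[|b_j|^(1/k)].
d_gm = np.exp(np.mean(np.log(np.abs(b)))) * np.cos(np.pi / (2 * k)) ** k

print(d_true, d_median, d_gm)
```

With k = 500, the relative standard deviation of both estimators is on the order of π/(2√k) ≈ 7%, so the two estimates land close to the true distance; the heavy tails of the Cauchy distribution rule out the naive sample-mean estimator entirely, which is why nonlinear estimators are needed.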




Editor information

Nader H. Bshouty, Claudio Gentile


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Li, P., Hastie, T.J., Church, K.W. (2007). Nonlinear Estimators and Tail Bounds for Dimension Reduction in ℓ1 Using Cauchy Random Projections. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science, vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_37

  • DOI: https://doi.org/10.1007/978-3-540-72927-3_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72925-9

  • Online ISBN: 978-3-540-72927-3

  • eBook Packages: Computer Science (R0)
