
Inference of a Dyadic Measure and Its Simplicial Geometry from Binary Feature Data and Application to Data Quality

  • Linda Ness
Chapter
Part of the Association for Women in Mathematics Series book series (AWMS, volume 17)

Abstract

We propose a new method for representing data sets that have an ordered set of binary features; the representation summarizes both measure-theoretic and topological properties. The method does not require any assumption of metric space properties for the data. A data set with an ordered set of binary features is viewed as a dyadic set with a dyadic measure. We prove that dyadic sets with dyadic measures have a canonical set of binary features and determine canonical nerve simplicial complexes. The method computes two related representations: multiscale parameters for the dyadic measure and the Betti numbers of the simplicial complex. It exploits the dyadic product formula representation formulated in previous work. The parameters characterize the relative skewness of the measure at dyadic scales and localities. The more abstract Betti number statistics summarize the simplicial geometry of the support of the measure. We prove that these statistics provide a simple privacy property. Our methods are compared with other results for measures on sets with tree structures, recent multi-resolution theory, and computational topology. We illustrate the method on a data quality data set and propose future research directions.
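
As a rough illustration of the multiscale parameters mentioned in the abstract, the Python sketch below estimates one coefficient per binary prefix from a small made-up sample. It assumes the convention common in the dyadic product formula literature, in which a node I with mass μ(I) sends the fraction (1 + a_I)/2 to the child where the next feature is 0 and (1 − a_I)/2 to the child where it is 1, so that a_I = (μ(I0) − μ(I1))/μ(I). The function name `product_coefficients`, the sign convention, and the sample rows are assumptions made for illustration; they are not the chapter's code or data.

```python
from collections import Counter


def product_coefficients(rows):
    """Estimate a coefficient a_I in [-1, 1] for each observed binary prefix I.

    rows: equal-length sequences of 0/1 feature values, in feature order.
    Assumed convention: a_I = (count(I + 0) - count(I + 1)) / count(I).
    Prefixes that never occur (zero empirical mass) are simply omitted.
    """
    n = len(rows[0])
    counts = Counter()
    for row in rows:
        bits = tuple(row)
        # Every prefix of the row, including the empty prefix (the root).
        for k in range(n + 1):
            counts[bits[:k]] += 1

    coeffs = {}
    for prefix, total in counts.items():
        if len(prefix) == n:
            continue  # full-length strings are leaves and have no children
        left = counts.get(prefix + (0,), 0)
        right = counts.get(prefix + (1,), 0)
        coeffs[prefix] = (left - right) / total
    return coeffs


# Hypothetical sample: four records with two ordered binary features each.
rows = [(0, 0), (0, 0), (0, 1), (1, 1)]
coeffs = product_coefficients(rows)
for prefix in sorted(coeffs, key=lambda p: (len(p), p)):
    print(prefix, round(coeffs[prefix], 3))
```

For this sample the root coefficient is 0.5, because three of the four records have first feature 0. A coefficient near 0 indicates a balanced split at that scale and locality, while values near ±1 indicate strong skew toward one child.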

Notes

Acknowledgements

The author gratefully acknowledges the CCICADA Center at Rutgers and the CCICADA Data Quality Team for providing the raw data quality statistics and thanks Christie Nelson for explaining them. The author also gratefully acknowledges use of the open source Computational Homology Project software (CHomP) [10] and thanks Shaun Harker for assistance with the installation and use of the software. This work was partially enabled by DIMACS through support from the National Science Foundation under Grant No. CCF-1445755 and partially supported by DARPA SocialSim-W911NF-17-C-0098.

References

  1. L. Ahlfors, Lectures on Quasi-Conformal Mappings, vol. 10 (van Nostrand Mathematical Studies, Princeton, 1966)
  2. D. Bassu, P.W. Jones, L. Ness, D. Shallcross, Product Formalisms for Measures on Spaces with Binary Tree Structures: Representation, Visualization and Multiscale Noise, submitted to SIGMA Forum of Maths (under revision) (2016). https://arxiv.org/abs/1601.02946
  3. A. Beurling, L. Ahlfors, The boundary correspondence under quasi-conformal mappings. Acta Math. 96, 125–142 (1956)
  4. L. Billera, S. Holmes, K. Vogtmann, Geometry of the space of phylogenetic trees. Adv. Appl. Math. 27, 733–767 (2001)
  5. C. Dwork, A. Roth, The algorithmic foundations of differential privacy. Found. Trends Theor. Comput. Sci. 9, 211–401 (2014)
  6. H. Edelsbrunner, J. Harer, Persistent homology—a survey. Contemp. Math. 453, 257–282 (2008)
  7. B. Fasy, F. Lecci, A. Rinaldo, L. Wasserman, S. Balakrishnan, A. Singh, Confidence sets for persistence diagrams. Ann. Stat. 42, 2301–2339 (2014)
  8. R. Fefferman, C. Kenig, J. Pipher, The theory of weights and the Dirichlet problem for elliptic equations. Ann. Math. 134, 65–124 (1991)
  9. M. Gavish, B. Nadler, R. Coifman, Multiscale wavelets on trees, graphs and high dimensional data: theory and applications to semi supervised learning, in Proceedings of the 27th International Conference on Machine Learning (Omnipress, Madison, 2010), pp. 367–374
  10. S. Harker, K. Mischaikow, M. Mrozek, V. Nanda, Discrete Morse theoretic algorithms for computing homology of complexes and maps. Found. Comput. Math. 14, 151–184 (2014)
  11. T. Kaczynski, K. Mischaikow, M. Mrozek, Computational Homology. Applied Mathematical Sciences, vol. 157 (Springer, New York, 2004)
  12. J.-P. Kahane, Sur le chaos multiplicatif. Ann. Sci. Math. Québec 9, 105–150 (1985)
  13. E. Kolaczyk, R. Nowak, Multiscale likelihood analysis and complexity penalized estimation. Ann. Stat. 32, 500–527 (2004)
  14. X. Meng, A trio of inference problems that could win you a Nobel Prize in statistics (if you help fund it), in Past, Present, Future Stat. Sci. (CRC Press, Boca Raton, 2014), pp. 537–562
  15. L. Ness, Dyadic product formula representations of confidence measures and decision rules for dyadic data set samples, in MISNC SI DS 201 (ACM, New York, 2016)
  16. R. Rhodes, V. Vargas, Gaussian multiplicative chaos and applications: a review. Probab. Surv. 11, 315–392 (2014)
  17. K. Turner, S. Mukherjee, D. Boyer, Persistent homology transform for modeling shapes and surfaces. Inf. Inf. 3, 310–344 (2014)

Copyright information

© The Author(s) and the Association for Women in Mathematics 2019

Authors and Affiliations

  1. Rutgers University, New Brunswick, USA
