Mining Textual Reviews with Hierarchical Latent Tree Analysis

  • Conference paper
Data Mining and Big Data (DMBD 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10387)

Abstract

Collecting customer feedback is an important task for any business that hopes to retain customers and improve its quality of service. Nowadays, customers can post reviews on many websites. The vast number of textual reviews makes it difficult for customers or businesses to read them directly. Topic modeling methods are commonly used to analyze such text data. In this paper, we propose to analyze textual reviews using a recently developed topic modeling method called hierarchical latent tree analysis (HLTA), which has been shown to produce better topic hierarchies than some state-of-the-art topic modeling methods. We test the method on textual reviews of restaurants from the Yelp website. We show that the resulting topic hierarchy reveals useful insights about the reviews. We further show how to find interesting topics that are specific to locations.
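The abstract does not say how location-specific topics are scored, so the following is only a minimal sketch under an assumption: that the association between a topic and a location can be measured by the empirical mutual information between a binary "review belongs to topic" flag and the review's city. The function name, the toy data, and the city labels are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(topic_flags, locations):
    """Empirical mutual information (in bits) between a binary topic
    indicator and a categorical location label, one entry per review.

    Assumption: this scoring rule is illustrative only; the abstract does
    not state which measure the authors use.
    """
    n = len(topic_flags)
    joint = Counter(zip(topic_flags, locations))
    topic_counts = Counter(topic_flags)
    location_counts = Counter(locations)
    mi = 0.0
    for (flag, loc), count in joint.items():
        p_joint = count / n
        # p(flag, loc) / (p(flag) * p(loc)), written with raw counts.
        ratio = p_joint * n * n / (topic_counts[flag] * location_counts[loc])
        mi += p_joint * math.log2(ratio)
    return mi


# Hypothetical toy data: four reviews from two cities.
locations = ["Las Vegas", "Las Vegas", "Phoenix", "Phoenix"]
buffet_topic = [1, 1, 0, 0]   # topic occurs only in Las Vegas reviews
service_topic = [1, 0, 1, 0]  # topic occurs equally often in both cities

mi_buffet = mutual_information(buffet_topic, locations)
mi_service = mutual_information(service_topic, locations)
print(mi_buffet, mi_service)
```

In this toy setting, a topic confined to one city scores higher than a topic spread evenly across cities, so ranking topics by such a score would surface the location-specific ones first.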

Notes

  1. http://www.yelp.com/

  2. https://github.com/optimaize/language-detector

  3. https://cloud.google.com/translate/

  4. https://github.com/kmpoon/hlta

Acknowledgment

This work was supported in part by the Education University of Hong Kong under grant RG90/2014-2015R and in part by the Hong Kong Research Grants Council under grants 16202515 and 16212516.

Author information

Corresponding author

Correspondence to Leonard K. M. Poon.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Poon, L.K.M., Leung, C.F., Zhang, N.L. (2017). Mining Textual Reviews with Hierarchical Latent Tree Analysis. In: Tan, Y., Takagi, H., Shi, Y. (eds) Data Mining and Big Data. DMBD 2017. Lecture Notes in Computer Science, vol 10387. Springer, Cham. https://doi.org/10.1007/978-3-319-61845-6_40

  • DOI: https://doi.org/10.1007/978-3-319-61845-6_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61844-9

  • Online ISBN: 978-3-319-61845-6

  • eBook Packages: Computer Science, Computer Science (R0)
