Towards Better Understanding of App Functions

  • Regular Papers
Journal of Computer Science and Technology

Abstract

Apps are attracting growing attention on both mobile and web platforms. Because current app marketplaces are self-organized, app descriptions are written informally and contain many noisy words and sentences. As a result, the functions of most apps are poorly documented and cannot be captured easily by app search engines. In this paper, we study the problem of inferring the real functions of an app by identifying the most informative words in its description. To integrate the diverse information in the app corpus properly, we propose a probabilistic topic model that discovers the latent structure of the corpus. The outputs of the topic model are then used to identify the function of an app and its most informative words. We verify the effectiveness of the proposed methods through extensive experiments on two real app datasets crawled from Google Play and the Windows Phone Store.
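The abstract does not specify the proposed model, so the following is only a rough sketch of the general idea, not the authors' method. It assumes plain latent Dirichlet allocation fitted by collapsed Gibbs sampling as a stand-in topic model, and it assumes that a word's informativeness can be approximated by its probability under the description's dominant topic. The function names `lda_gibbs` and `informative_words`, the hyperparameter values, and the toy descriptions are all hypothetical.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit plain LDA (a stand-in, not the paper's model) by collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # tokens of word w in topic k
    topic_total = [0] * num_topics                        # tokens in topic k
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed point estimates of document-topic and topic-word distributions.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in vocab}
        for k in range(num_topics)
    ]
    return theta, phi


def informative_words(doc, theta_d, phi, top_n=3):
    """Rank a description's distinct words by P(word | dominant topic)."""
    k = max(range(len(theta_d)), key=lambda t: theta_d[t])
    ranked = sorted(set(doc), key=lambda w: (-phi[k][w], w))
    return k, ranked[:top_n]


# Toy descriptions (invented for illustration), already tokenized.
docs = [
    "best free photo editor filters crop photo share".split(),
    "camera photo filters selfie free download".split(),
    "track run distance calories fitness free".split(),
    "fitness workout calories track best download".split(),
]
theta, phi = lda_gibbs(docs, num_topics=2)
for doc, theta_d in zip(docs, theta):
    print(informative_words(doc, theta_d, phi))
```

On a corpus this small the inferred topics are unstable, so the printed rankings only illustrate the data flow: the dominant topic serves as a crude proxy for an app's function, and the top-ranked words as a proxy for its most informative words.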

Author information

Corresponding author

Correspondence to Yong-Xin Tong.

Additional information

This work is supported in part by the Hong Kong RGC Project under Grant No. N_HKUST637/13, the National Basic Research 973 Program of China under Grant No. 2014CB340303, the National Natural Science Foundation of China under Grant Nos. 61328202 and 61502021, Microsoft Research Asia Gift Grant, Google Faculty Award 2013, and Microsoft Research Asia Fellowship 2012.

About this article

Cite this article

Tong, YX., She, J. & Chen, L. Towards Better Understanding of App Functions. J. Comput. Sci. Technol. 30, 1130–1140 (2015). https://doi.org/10.1007/s11390-015-1588-0

  • DOI: https://doi.org/10.1007/s11390-015-1588-0
