Two-Stage Sampling Method for Social Media Bigdata
In recent years, social media has become the most popular Internet application, and researchers from many disciplines have turned to the study of social media big data. Many empirical studies indicate that sampling is a valid data processing method for studying domain problems. However, existing sampling methods still leave some problems unresolved, such as how to select a sampling method and how to evaluate the resulting sample. We propose a novel two-stage sampling method that aims to improve sampling quality and is based on the idea of divide and conquer. First, a seed network with scale-free and small-world properties is established. Second, the Metropolis-Hastings sampling method, an improvement on the snowball method, is applied to generate a sample network. Empirical test results indicate that the credibility of the two-stage sampling method is significantly better than that of existing sampling methods at both the macro level and the micro level.
Keywords: Big data, Two-stage sampling, Online social network
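
The second stage described in the abstract can be illustrated with a minimal sketch of a standard Metropolis-Hastings random walk over an undirected graph. This is not the authors' implementation: the abstract does not specify how their variant differs from the textbook walk, so the acceptance rule below is the usual degree-ratio rule, and the names `TOY_GRAPH`, `mh_random_walk_sample`, and `induced_subgraph` are hypothetical. The toy adjacency list only stands in for a seed network and is not claimed to be scale-free or small-world.

```python
import random

# Hypothetical toy graph (undirected adjacency lists) standing in for a seed network.
TOY_GRAPH = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b", "e"],
    "d": ["a", "f"],
    "e": ["c", "f"],
    "f": ["d", "e"],
}


def mh_random_walk_sample(adj, start, sample_size, rng=None, max_steps=100_000):
    """Collect distinct nodes with a standard Metropolis-Hastings random walk.

    From the current node u, a neighbor v is proposed uniformly at random and
    accepted with probability min(1, deg(u) / deg(v)); otherwise the walk
    stays at u. This corrects the bias of a plain random walk toward
    high-degree nodes.
    """
    rng = rng or random.Random()
    current = start
    visited = [start]  # distinct nodes in order of first visit
    seen = {start}
    for _ in range(max_steps):
        if len(seen) >= sample_size:
            break
        neighbors = adj[current]
        if not neighbors:  # isolated node: the walk cannot move
            break
        candidate = rng.choice(neighbors)
        accept_prob = min(1.0, len(adj[current]) / len(adj[candidate]))
        if rng.random() < accept_prob:
            current = candidate
            if current not in seen:
                seen.add(current)
                visited.append(current)
    return visited


def induced_subgraph(adj, nodes):
    """Return the sample network: sampled nodes plus the edges among them."""
    keep = set(nodes)
    return {u: [v for v in adj[u] if v in keep] for u in nodes}


if __name__ == "__main__":
    sample = mh_random_walk_sample(TOY_GRAPH, "a", 4, rng=random.Random(0))
    print(induced_subgraph(TOY_GRAPH, sample))
```

Passing a seeded `random.Random` makes a run repeatable, which helps when comparing sample networks against the original graph on macro-level or micro-level statistics.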