Experiences Using BDS: A Crawler for Social Internetworking Scenarios

Part of the book series: Lecture Notes in Social Networks (LNSN)

Abstract

In new-generation social networks, we expect the paradigm of Social Internetworking Scenarios (SISs) to become increasingly important. In this setting, Social Network Analysis remains crucial, but a necessary preliminary step is to design an effective way to crawl the underlying graph. While crawling has been investigated in depth for social networks, it is still an open issue for SISs. Indeed, we cannot expect a crawling strategy that works well for social networks to remain valid in a Social Internetworking Scenario, because of the specific topological features of this scenario. In this paper, we first confirm this claim and then define a new crawling strategy, specifically conceived for SISs, that overcomes the drawbacks of state-of-the-art crawling strategies. We then use this strategy to investigate SISs and to understand their main properties and the features of their main actors (i.e., bridges).

Notes

  1. Clearly, if a node has no incoming edges, it maintains its weight.

Acknowledgements

This work has been partially supported by the TENACE PRIN Project (n. 20103P34XC) funded by the Italian Ministry of Education, University and Research.

Author information

Corresponding author

Correspondence to Francesco Buccafurri.

Copyright information

© 2014 Springer-Verlag Wien

About this chapter

Cite this chapter

Buccafurri, F., Lax, G., Nocera, A., Ursino, D. (2014). Experiences Using BDS: A Crawler for Social Internetworking Scenarios. In: Gündüz-Öğüdücü, Ş., Etaner-Uyar, A. (eds) Social Networks: Analysis and Case Studies. Lecture Notes in Social Networks. Springer, Vienna. https://doi.org/10.1007/978-3-7091-1797-2_8

  • DOI: https://doi.org/10.1007/978-3-7091-1797-2_8

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-7091-1796-5

  • Online ISBN: 978-3-7091-1797-2

  • eBook Packages: Computer Science, Computer Science (R0)
