Networks and Spatial Economics, Volume 16, Issue 2, pp 545–578

Organization Mining Using Online Social Networks

  • Michael Fire
  • Rami Puzis


Complementing the formal organizational structure of a business are the informal connections among employees. These relationships help identify knowledge hubs, working groups, and shortcuts through the organizational structure. They carry valuable information on how a company functions de facto. In the past, eliciting the informal social networks within an organization was challenging; today they are reflected by friendship relationships in online social networks. In this paper we analyze several commercial organizations by mining data which their employees have exposed on Facebook, LinkedIn, and other publicly available sources. Using a web crawler designed for this purpose, we extract a network of informal social relationships among employees of targeted organizations. Our results show that it is possible to identify leadership roles within the organization solely by using centrality analysis and machine learning techniques applied to the informal relationship network structure. Valuable non-trivial insights can also be gained by clustering an organization’s social network and gathering publicly available information on the employees within each cluster. Knowledge of the network of informal relationships may be a major asset or might be a significant threat to the underlying organization.
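As a rough illustration of the centrality analysis mentioned in the abstract, the sketch below ranks the nodes of a small friendship graph by degree and closeness centrality. It is not the authors' pipeline: the employee names, the edge list, and all function names are hypothetical, and the machine learning and clustering steps are left out. It uses only the Python standard library.

```python
from collections import deque

# Hypothetical toy friendship network among employees (not data from the paper).
EDGES = [
    ("ana", "ben"), ("ana", "cal"), ("ana", "dee"),
    ("ben", "cal"), ("dee", "eli"), ("eli", "fay"),
]


def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """(Number of reachable nodes) / (sum of shortest-path distances), via BFS."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def rank(scores):
    """Node names sorted from most to least central; ties broken by name."""
    return sorted(scores, key=lambda node: (-scores[node], node))


graph = build_graph(EDGES)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
print("by degree:   ", rank(degree))
print("by closeness:", rank(closeness))
```

In a setting like the one the abstract describes, scores of this kind would presumably serve as input features for a classifier of leadership roles rather than as a final answer on their own.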


Keywords: Organizational data mining, Social network data mining, Social network privacy, Organizational social network privacy, Facebook, LinkedIn, Machine learning, Leadership roles


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Telekom Innovation Laboratories and Department of Information Systems Engineering, Ben Gurion University of the Negev, Beer Sheva, Israel
