
Net-Ray: Visualizing and Mining Billion-Scale Graphs

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8443)

Abstract

How can we visualize billion-scale graphs? How can we spot outliers in such graphs quickly? Visualizing graphs is the most direct way of understanding them; however, billion-scale graphs are very difficult to visualize since the amount of information overflows the resolution of a typical screen.

In this paper we propose Net-Ray, an open-source package for visualization-based mining on billion-scale graphs. Net-Ray visualizes graphs using the spy plot (adjacency matrix patterns), distribution plot, and correlation plot, which involve careful node ordering and scaling. In addition, Net-Ray efficiently summarizes scatter clusters of graphs in a way that finds outliers automatically, and makes it easy to interpret them visually.
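To make the spy-plot idea concrete, the following is a minimal sketch of how an adjacency matrix far larger than a screen could be binned into a fixed pixel grid and log-scaled. It is an illustration under stated assumptions, not Net-Ray's implementation: the helper names `spy_grid` and `log_scale` are hypothetical, node ids are assumed to be integers in [0, num_nodes), and the node ordering is taken as given rather than computed with the careful ordering the abstract refers to.

```python
import numpy as np

def spy_grid(src, dst, num_nodes, resolution):
    """Count edges per pixel of a resolution x resolution grid.

    Hypothetical helper: assumes node ids are integers in [0, num_nodes).
    """
    src = np.asarray(src, dtype=np.int64)
    dst = np.asarray(dst, dtype=np.int64)
    # Map each node id to a pixel index in [0, resolution).
    rows = src * resolution // num_nodes
    cols = dst * resolution // num_nodes
    grid = np.zeros((resolution, resolution), dtype=np.int64)
    # np.add.at accumulates correctly when the same pixel occurs many times.
    np.add.at(grid, (rows, cols), 1)
    return grid

def log_scale(grid):
    """Compress heavy-tailed edge counts so sparse pixels stay visible."""
    return np.log1p(grid)

# Toy graph with 8 nodes, drawn on a 2 x 2 "screen".
edges = [(0, 1), (1, 2), (2, 3), (4, 5), (6, 7), (7, 0)]
src, dst = zip(*edges)
grid = spy_grid(src, dst, num_nodes=8, resolution=2)
print(grid)
print(log_scale(grid))
```

Each pixel holds the number of edges whose endpoints fall in the corresponding block of node ids, so the output size depends only on the chosen resolution, not on the number of nodes. The resulting array could be passed to any image-plotting routine.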

Extensive experiments show that Net-Ray handles very large graphs with billions of nodes and edges efficiently and effectively. Specifically, among the various datasets that we study, we visualize in multiple ways the YahooWeb graph which spans 1.4 billion webpages and 6.6 billion links, and the Twitter who-follows-whom graph, which consists of 62.5 million users and 1.8 billion edges. We report interesting clusters and outliers spotted and summarized by Net-Ray.




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kang, U., Lee, JY., Koutra, D., Faloutsos, C. (2014). Net-Ray: Visualizing and Mining Billion-Scale Graphs. In: Tseng, V.S., Ho, T.B., Zhou, ZH., Chen, A.L.P., Kao, HY. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8443. Springer, Cham. https://doi.org/10.1007/978-3-319-06608-0_29

  • DOI: https://doi.org/10.1007/978-3-319-06608-0_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06607-3

  • Online ISBN: 978-3-319-06608-0

  • eBook Packages: Computer Science, Computer Science (R0)
