Answering Why-Not Questions on Structural Graph Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10827))

Abstract

Structural graph clustering is a fundamental problem in managing and analyzing graph data. As a fast and exact density-based graph clustering algorithm, pSCAN is widely used to discover meaningful clusters in many graph applications. The problem of explaining why-not questions on pSCAN is to determine why an expected vertex is not included in a specified cluster of the pSCAN results. The pSCAN results are sensitive to two parameters: (i) the similarity threshold \(\epsilon \); and (ii) the density constraint \(\mu \). When they are not set appropriately, some expected vertices are missing from the specified clusters. To tackle this problem, we first analyze how the parameters affect the results of pSCAN, and then propose two novel explanation algorithms that explain why-not questions on pSCAN by advising how to refine the initial pSCAN with minimum penalty from two perspectives: (i) modifying the parameter \(\epsilon \); and (ii) modifying the parameter \(\mu \). Moreover, we present constraints to ensure that the original pSCAN results are retained as much as possible in the results of the refined pSCAN. Finally, we conduct comprehensive experimental studies, which show that our approaches can efficiently return high-quality explanations for why-not questions on pSCAN.
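
To make the sensitivity to \(\epsilon \) and \(\mu \) concrete, the following is a minimal sketch of SCAN-style structural clustering on a toy graph. It is not the paper's explanation algorithm and not pSCAN's optimized procedure; it assumes the usual structural similarity (shared closed neighbors divided by the geometric mean of the closed-neighborhood sizes) and the usual core-vertex rule (at least \(\mu \) vertices in the \(\epsilon \)-neighborhood, counting the vertex itself). The graph and all function names are hypothetical.

```python
from math import sqrt

def closed_neighborhood(adj, v):
    """Neighbors of v plus v itself."""
    return adj[v] | {v}

def similarity(adj, u, v):
    """Assumed structural similarity: |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|)."""
    nu, nv = closed_neighborhood(adj, u), closed_neighborhood(adj, v)
    return len(nu & nv) / sqrt(len(nu) * len(nv))

def eps_neighborhood(adj, u, eps):
    """Vertices in N[u] whose similarity to u is at least eps."""
    return {v for v in closed_neighborhood(adj, u)
            if similarity(adj, u, v) >= eps}

def is_core(adj, u, eps, mu):
    """u is a core vertex if its eps-neighborhood has at least mu vertices."""
    return len(eps_neighborhood(adj, u, eps)) >= mu

def scan_clusters(adj, eps, mu):
    """Grow one cluster per connected group of cores; attach border vertices."""
    cores = {v for v in adj if is_core(adj, v, eps, mu)}
    clusters, seen = [], set()
    for start in sorted(cores):
        if start in seen:
            continue
        cluster, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            reach = eps_neighborhood(adj, u, eps)
            cluster |= reach  # u itself, similar cores, and border vertices
            for v in reach:
                if v in cores and v not in seen:
                    seen.add(v)
                    stack.append(v)
        clusters.append(cluster)
    return clusters

# Hypothetical toy graph: a 4-clique {0, 1, 2, 3} plus vertex 4 attached to 2 and 3.
adj = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3, 4},
    3: {0, 1, 2, 4},
    4: {2, 3},
}

strict = scan_clusters(adj, eps=0.80, mu=3)   # vertex 4 is left out
relaxed = scan_clusters(adj, eps=0.75, mu=3)  # vertex 4 joins the cluster
print(strict, relaxed)
```

In this toy graph the similarity between vertex 4 and each of vertices 2 and 3 is about 0.775, so asking "why is vertex 4 not in the cluster?" at \(\epsilon = 0.8\) has a simple answer under these assumptions: a small decrease of \(\epsilon \) admits it, while the original four members stay together.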

Notes

  1. http://snap.stanford.edu/data/

Acknowledgements

The work is supported by the National Natural Science Foundation of China (Nos. U1736104, 61572122, 61532021, 61502317, 61502316, 61702344).

Author information

Corresponding author

Correspondence to Chuanyu Zong.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Zong, C. et al. (2018). Answering Why-Not Questions on Structural Graph Clustering. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10827. Springer, Cham. https://doi.org/10.1007/978-3-319-91452-7_17

  • DOI: https://doi.org/10.1007/978-3-319-91452-7_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91451-0

  • Online ISBN: 978-3-319-91452-7

  • eBook Packages: Computer Science, Computer Science (R0)
