Discovery of Customer Communities – Evaluation Aspects

Korczak, Jerzy; Pondel, Maciej; Sroka, Wiktor

doi:10.1007/978-3-030-43353-6_10

Part of the book series: Lecture Notes in Business Information Processing ((LNBIP,volume 380))

Abstract

This paper evaluates a new multi-level hybrid method of community detection, which combines density-based clustering with label propagation, and compares it with the k-means benchmark and with DBSCAN (density-based spatial clustering of applications with noise). Despite sophisticated visualization methods, managers still often find clustering results too difficult to evaluate and interpret. The paper presents a set of key assessment measures that can be used to evaluate the internal and external quality of the discovered clusters. The approach is validated on a real-life marketing database using the advanced analytics platform Upsaily.

Notes

1. Between-Cluster Dispersion can be calculated as \( BCD(n) = \sum_{i=1}^{n} \overline{c_i} \cdot d^{2}(c_i, c) \), where \( n \) is the number of clusters, \( d(c_i, c) \) is the distance between the centroid of cluster \( c_i \) and the global center \( c \) of all clusters, and \( \overline{c_i} \) is the number of elements in cluster \( c_i \).
2. Within-Cluster Dispersion can be calculated as \( WCD(n) = \sum_{i=1}^{n} \sum_{x \in c_i} d^{2}(x, c_i) \), where \( n \) is the number of clusters, \( x \) is an element of cluster \( c_i \), and \( d(x, c_i) \) is the distance between the element \( x \) and the centroid of the cluster \( c_i \) to which it belongs.
3. The code for the Dunn index calculation was found on GitHub: Dunn index for clusters analysis, https://gist.github.com/douglasrizzo/cd7e792ff3a2dcaf27f6. Computing the Dunn index is relatively simple; the authors verified the published code before using it to confirm its validity.
4. The implementation of the Davies–Bouldin index was taken from the scikit-learn library: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.davies_bouldin_score.html.
5. The implementation of the Silhouette index was taken from the scikit-learn library: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html.
6. The implementation of the Calinski-Harabasz index was taken from the scikit-learn library: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.calinski_harabasz_score.html.
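
The measures named in these notes can be illustrated with a short sketch. It is not the authors' code: it assumes Euclidean distance, takes the global center to be the mean of all points, uses synthetic data from `make_blobs` with k-means labels as a stand-in for the marketing database, and implements one common variant of the Dunn index (smallest distance between points of different clusters divided by the largest cluster diameter) rather than the gist cited above. The helper names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    pairwise_distances,
    silhouette_score,
)


def between_cluster_dispersion(X, labels):
    """BCD: cluster size times squared centroid-to-global-center distance, summed."""
    global_center = X.mean(axis=0)  # assumption: global center = mean of all points
    total = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        centroid = members.mean(axis=0)
        total += len(members) * np.sum((centroid - global_center) ** 2)
    return total


def within_cluster_dispersion(X, labels):
    """WCD: squared distance of every element to its own cluster centroid, summed."""
    total = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        centroid = members.mean(axis=0)
        total += np.sum((members - centroid) ** 2)
    return total


def dunn_index(X, labels):
    """One common Dunn variant: min inter-cluster distance / max cluster diameter."""
    dist = pairwise_distances(X)
    ks = np.unique(labels)
    max_diameter = max(dist[np.ix_(labels == k, labels == k)].max() for k in ks)
    min_separation = min(
        dist[np.ix_(labels == a, labels == b)].min()
        for i, a in enumerate(ks)
        for b in ks[i + 1:]
    )
    return min_separation / max_diameter


# Synthetic stand-in data and labels (assumption: not the paper's dataset).
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

bcd = between_cluster_dispersion(X, labels)
wcd = within_cluster_dispersion(X, labels)

# With Euclidean distance, the Calinski-Harabasz index is the ratio of the
# two dispersions, each divided by its degrees of freedom.
n_samples, n_clusters = len(X), len(np.unique(labels))
ch_manual = (bcd / (n_clusters - 1)) / (wcd / (n_samples - n_clusters))

print("BCD:", bcd)
print("WCD:", wcd)
print("Calinski-Harabasz (manual): ", ch_manual)
print("Calinski-Harabasz (sklearn):", calinski_harabasz_score(X, labels))
print("Davies-Bouldin:", davies_bouldin_score(X, labels))
print("Silhouette:", silhouette_score(X, labels))
print("Dunn:", dunn_index(X, labels))
```

Under these assumptions, BCD and WCD add up to the total squared deviation of the data from the global center, so a partition that raises BCD necessarily lowers WCD. Higher Calinski-Harabasz, Silhouette, and Dunn values and lower Davies–Bouldin values indicate more compact, better separated clusters.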

Author information

Authors and Affiliations

International University of Logistics and Transport, ul. Sołtysowicka 19B, 51-168, Wrocław, Poland
Jerzy Korczak
Wroclaw University of Economics and Business, ul. Komandorska 118-120, 53-345, Wrocław, Poland
Maciej Pondel & Wiktor Sroka

Corresponding authors

Correspondence to Jerzy Korczak, Maciej Pondel or Wiktor Sroka.

Editor information

Editors and Affiliations

University of Economics in Katowice, Katowice, Poland
Ewa Ziemba

Cite this paper

Korczak, J., Pondel, M., Sroka, W. (2020). Discovery of Customer Communities – Evaluation Aspects. In: Ziemba, E. (ed.) Information Technology for Management: Current Research and Future Directions. AITM ISM 2019. Lecture Notes in Business Information Processing, vol 380. Springer, Cham. https://doi.org/10.1007/978-3-030-43353-6_10

DOI: https://doi.org/10.1007/978-3-030-43353-6_10
Published: 11 March 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43352-9
Online ISBN: 978-3-030-43353-6
eBook Packages: Computer Science, Computer Science (R0)
