# On solving the multiple p-median problem based on biclustering

## Abstract

In this paper, we discuss the multiple p-median problem (MPMP), an extension of the original p-median problem, and present several potential applications. The objective of the well-known p-median problem is to locate p facilities so as to minimize the total distance between demand points and facilities, with each demand point covered by its closest facility. In the MPMP, each demand point should be covered by more than one of the facilities closest to it; the number of covering facilities is given by the parameter mc. The MPMP can be applied to various location problems, e.g. the provision of emergency services, where alternative facilities are needed to hedge against the unavailability of the primary facility, as well as to other domains, e.g. recommender systems, where it may be desirable to respond to each user query with more than one available choice that satisfies the user's preferences. We efficiently solve the MPMP by using a biclustering heuristic, which creates biclusters from the distance matrix. In the proposed approach, a bicluster represents a subset of demand points covered by a subset of facilities. The heuristic selects appropriate biclusters taking into account the objective of the problem. In experimental tests on known benchmark problems, our method provided solutions slightly inferior to the optimal ones in significantly less computational time than the CPLEX optimizer. On larger test instances, our method outperforms CPLEX in terms of both computational time and solution quality when a time bound of 1 h is set for obtaining a solution.
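To make the difference between the p-median problem and the MPMP concrete, the following is a minimal sketch, not the biclustering heuristic of the paper. It assumes that the MPMP objective sums, for every demand point, the distances to its mc closest open facilities (the abstract does not spell out the formulation), and it solves a tiny instance by exhaustive enumeration. The function names `mpmp_cost` and `brute_force_mpmp` and the distance matrix are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def mpmp_cost(dist, open_facilities, mc):
    """Sum, over all demand points, of the distances to the mc closest open facilities.

    dist[i][j] is the distance from demand point i to candidate facility j.
    Assumed objective: the abstract does not state the exact formulation.
    """
    total = 0
    for row in dist:
        nearest = sorted(row[j] for j in open_facilities)[:mc]
        total += sum(nearest)
    return total


def brute_force_mpmp(dist, p, mc):
    """Enumerate every set of p facilities; only viable for very small instances."""
    if not 1 <= mc <= p:
        raise ValueError("mc must satisfy 1 <= mc <= p")
    n_facilities = len(dist[0])
    best = min(
        combinations(range(n_facilities), p),
        key=lambda subset: mpmp_cost(dist, subset, mc),
    )
    return best, mpmp_cost(dist, best, mc)


# Hypothetical instance: 4 demand points (rows), 3 candidate facilities (columns).
dist = [
    [1, 4, 6],
    [5, 2, 3],
    [7, 3, 1],
    [2, 5, 8],
]

p_median_solution = brute_force_mpmp(dist, p=2, mc=1)  # classical p-median
mpmp_solution = brute_force_mpmp(dist, p=2, mc=2)      # each point covered twice
```

With mc = 1 the sketch reduces to the classical p-median problem. On this instance the two settings select different facility sets, which illustrates why the backup coverage requirement can change the optimal locations.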

## Keywords

Location, p-median problem, Data mining, Biclustering, Artificial intelligence

## Notes

### Acknowledgements

The authors are grateful to the anonymous reviewers for their constructive comments and suggestions; their detailed evaluation of this work contributed significantly to its improvement. The authors also wish to thank Aristotelis Kompothrekas, Ph.D. candidate at Patras University, for his valuable contribution in performing the experimental tests.
