Combinatorial Optimization in Data Mining

Reference work entry


This chapter presents data mining techniques that are formulated as combinatorial optimization problems, together with their applications. In a number of cases the fundamental data mining tool is not combinatorial in nature, yet widely used special-purpose combinatorial extensions exist. For completeness, these fundamental tools are discussed in detail before the extensions built on combinatorial optimization problems. Several computationally challenging data mining algorithms with non-convex formulations are also explored.
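As one illustration of the combinatorial character of a fundamental data mining tool, consider clustering: minimizing the Euclidean sum-of-squares objective exactly is NP-hard, and Lloyd's algorithm (k-means) is the classical heuristic that alternates between a discrete assignment step and a continuous centroid update. The sketch below is a minimal plain-Python version for illustration only; the function name and interface are assumptions, not part of this entry.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's heuristic for sum-of-squares clustering.

    The exact problem is NP-hard; this alternating scheme only
    reaches a local optimum. `points` is a list of equal-length
    numeric tuples; returns (centroids, labels).
    """
    rng = random.Random(seed)
    # Initialize centroids as k distinct data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step (combinatorial): attach each point to its
        # nearest centroid under squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable: a local optimum was reached
        labels = new_labels
        # Update step (continuous): move each centroid to the mean
        # of the points currently assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members)
                                for c in zip(*members)]
    return centroids, labels
```

On well-separated data the heuristic typically recovers the intended partition, but the final labeling can depend on the random initialization, which is precisely why global formulations of clustering are studied as combinatorial optimization problems.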


Keywords: Feature Selection · Support Vector Regression · Unlabeled Data · Data Mining Tool · Pattern Vector

Recommended Reading

1. J. Abello, M.G.C. Resende, S. Sudarsky, Massive quasi-clique detection, in LATIN 2002: Theoretical Informatics (Springer, Berlin/New York, 2002), pp. 598–612
2. S. Alexe, E. Blackstone, P. Hammer, H. Ishwaran, M. Lauer, C. Snader, Coronary risk prediction by logical analysis of data. Ann. Oper. Res. 119, 15–42 (2003)
3. D. Aloise, A. Deshpande, P. Hansen, P. Popat, NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75, 245–248 (2009)
4. D. Arthur, S. Vassilvitskii, How slow is the k-means method? in Proceedings of the 22nd Annual Symposium on Computational Geometry (ACM, New York, 2006), pp. 144–153
5. B. Balasundaram, S. Butenko, I.V. Hicks, Clique relaxations in social network analysis: the maximum k-plex problem. Oper. Res. 59, 133–142 (2011)
6. G.H. Ball, D.J. Hall, ISODATA, a novel method of data analysis and pattern classification. Technical report, Stanford Research Institute, Menlo Park, CA, 1965
7. A. Banerjee, S. Merugu, I.S. Dhillon, J. Ghosh, Clustering with Bregman divergences. J. Mach. Learn. Res. 6, 1705–1749 (2005)
8. A. Baraldi, P. Blonda, A survey of fuzzy clustering algorithms for pattern recognition – part II. IEEE Trans. Syst. Man Cybern. B 29(6), 786–801 (1999)
9. M. Belkin, I. Matveeva, P. Niyogi, Regularization and semi-supervised learning on large graphs. Learn. Theory 3120, 624–638 (2004)
10. A. Ben-Dor, L. Bruhn, N. Friedman, I. Nachman, M. Schummer, Z. Yakhini, Tissue classification with gene expression profiles, in Proceedings of the 4th Annual International Conference on Computational Biology (RECOMB), Tokyo, 2000, pp. 54–64
11. A. Ben-Dor, N. Friedman, Z. Yakhini, Class discovery in gene expression data, in Proceedings of the 5th Annual International Conference on Computational Biology (RECOMB) (ACM, New York, 2001), pp. 31–38
12. A. Ben-Dor, B. Chor, R. Karp, Z. Yakhini, Discovering local structure in gene expression data: the order-preserving submatrix problem. J. Comput. Biol. 10(3–4), 373–384 (2003)
13. Y. Bengio, O. Delalleau, N. Le Roux, Label propagation and quadratic criterion, in Semi-Supervised Learning (MIT, Cambridge, 2006)
14. K.P. Bennett, A. Demiriz, Semi-supervised support vector machines. Adv. Neural Inf. Process. Syst. 11, 368–374 (1999)
15. C. Bergeron, F. Cheriet, J. Ronsky, R. Zernicke, H. Labelle, Prediction of anterior scoliotic spinal curve from trunk surface using support vector regression. Eng. Appl. Artif. Intell. 18(8), 973–983 (2005)
16. D. Bertsimas, R. Shioda, Classification and regression via integer optimization. Oper. Res. 55(2), 252–271 (2007)
17. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms (Kluwer Academic, Norwell, 1981)
18. T.D. Bie, N. Cristianini, Semi-supervised learning using semi-definite programming, in Semi-Supervised Learning (MIT, Cambridge, 2006), pp. 119–135
19. C.M. Bishop, Pattern Recognition and Machine Learning (Springer, New York, 2006)
20. A.L. Blum, P. Langley, Selection of relevant features and examples in machine learning. Artif. Intell. 97(1–2), 245–271 (1997)
21. A. Blum, T. Mitchell, Combining labeled and unlabeled data with co-training, in Proceedings of the 11th Annual Conference on Computational Learning Theory (ACM, New York, 1998), pp. 92–100
22. V. Boginski, Network-based data mining: operations research techniques and applications, in Encyclopedia of Operations Research and Management Science (Wiley, Hoboken, 2010), pp. 3498–3508
23. P.S. Bradley, O.L. Mangasarian, Feature selection via concave minimization and support vector machines, in Proceedings of the 15th International Conference on Machine Learning (ICML), Madison, 1998, pp. 82–90
24. P.S. Bradley, U.M. Fayyad, O.L. Mangasarian, Mathematical programming for data mining: formulations and challenges. INFORMS J. Comput. 11, 217–238 (1999)
25. J.P. Brooks, Support vector machines with the ramp loss and the hard margin loss. Oper. Res. 59(2), 467–479 (2011)
26. M. Brown, W. Grundy, D. Lin, N. Cristianini, C. Sugnet, T. Furey, M. Ares, D. Haussler, Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl. Acad. Sci. 97(1), 262–267 (2000)
27. K. Bryan, Biclustering of expression data using simulated annealing, in Proceedings of the 18th IEEE Symposium on Computer-Based Medical Systems (CBMS), Washington, DC, 2005, pp. 383–388
28. C. Burges, T. Shaked, E. Renshaw, A. Lazier, M. Deeds, N. Hamilton, G. Hullender, Learning to rank using gradient descent, in Proceedings of the 22nd International Conference on Machine Learning, Bonn, 2005, pp. 89–96
29. S. Busygin, O.A. Prokopyev, P.M. Pardalos, Feature selection for consistent biclustering. J. Comb. Optim. 10, 7–21 (2005)
30. S. Busygin, N. Boyko, P.M. Pardalos, M. Bewernitz, G. Ghacibeh, Biclustering EEG data from epileptic patients treated with vagus nerve stimulation, in Data Mining, Systems Analysis and Optimization in Biomedicine, vol. 953, ed. by O. Seref, O.E. Kundakcioglu, P.M. Pardalos (American Institute of Physics, Melville, 2007), pp. 220–231
31. S. Busygin, O. Prokopyev, P.M. Pardalos, Biclustering in data mining. Comput. Oper. Res. 35(9), 2964–2987 (2008)
32. D. Casasent, X.W. Chen, Waveband selection for hyperspectral data: optimal feature selection, in Proceedings of SPIE, vol. 5106, Orlando, FL, 2003, pp. 259–270
33. W. Chaovalitwongse, Novel quadratic programming approach for time series clustering with biomedical application. J. Comb. Optim. 15, 225–241 (2008)
34. O. Chapelle, Training a support vector machine in the primal. Neural Comput. 19, 1155–1178 (2007)
35. O. Chapelle, A. Zien, Semi-supervised classification by low density separation, in Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), Barbados, 2005, pp. 57–64
36. O. Chapelle, M. Chi, A. Zien, A continuation method for semi-supervised SVMs, in Proceedings of the 23rd International Conference on Machine Learning (ICML) (ACM, New York, 2006), pp. 185–192
37. O. Chapelle, V. Sindhwani, S.S. Keerthi, Branch and bound for semi-supervised support vector machines. Adv. Neural Inf. Process. Syst. 19, 217–224 (2007)
38. O. Chapelle, V. Sindhwani, S.S. Keerthi, Optimization techniques for semi-supervised support vector machines. J. Mach. Learn. Res. 9, 203–233 (2008)
39. X. Chen, An improved branch and bound algorithm for feature selection. Pattern Recognit. Lett. 24(12), 1925–1933 (2003)
40. Y. Cheng, G.M. Church, Biclustering of expression data, in Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (AAAI, Menlo Park, 2000), pp. 93–103
41. H. Cheng, Z. Liu, J. Yang, Sparsity induced similarity measure for label propagation, in Proceedings of the 12th IEEE International Conference on Computer Vision, Kyoto, 2010, pp. 317–324
42. K.Y. Choy, C.W. Chan, Modeling of river discharges and rainfall using radial basis function networks based on support vector regression. Int. J. Syst. Sci. 34(14–15), 763–773 (2003)
43. C. Cifarelli, G. Patrizi, Solving large protein folding problem by a linear complementarity algorithm with 0–1 variables. Optim. Methods Softw. 22(1), 25–49 (2007)
44. R. Collobert, F. Sinz, J. Weston, L. Bottou, T. Joachims, Large scale transductive SVMs. J. Mach. Learn. Res. 7 (2006)
45. N. Cristianini, J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods (Cambridge University Press, Cambridge, 2000)
46. M. Dash, H. Liu, Feature selection for classification. Intell. Data Anal. 1(3), 131–156 (1997)
47. O. Delalleau, Y. Bengio, N. Le Roux, Efficient non-parametric function induction in semi-supervised learning, in Proceedings of the 10th International Workshop on Artificial Intelligence and Statistics (AISTATS), Barbados, 2005
48. A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. B 39(1), 1–38 (1977)
49. I.S. Dhillon, Co-clustering documents and words using bipartite spectral graph partitioning, in Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (ACM, New York, 2001), pp. 269–274
50. I.S. Dhillon, S. Mallela, D.S. Modha, Information-theoretic co-clustering, in Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) (ACM, New York, 2003), pp. 89–98
51. J. Doak, An evaluation of feature selection methods and their application to computer security. Technical report, University of California, 1992
52. C. Dwork, R. Kumar, M. Naor, D. Sivakumar, Rank aggregation methods for the web, in Proceedings of the 10th International Conference on World Wide Web (ACM, New York, 2001), pp. 613–622
53. S. Eschrich, J. Ke, L.O. Hall, D.B. Goldgof, Fast accurate fuzzy clustering through data reduction. IEEE Trans. Fuzzy Syst. 11(2), 262–270 (2003)
54. E. Forgy, Cluster analysis of multivariate data: efficiency vs. interpretability of classifications. Biometrics 21(3), 768 (1965)
55. A. Frank, D. Geiger, Z. Yakhini, A distance-based branch and bound feature selection algorithm, in Proceedings of the 19th Annual Conference on Uncertainty in Artificial Intelligence (UAI-03), Acapulco, 2003, pp. 241–248
56. Y. Freund, R. Iyer, R.E. Schapire, Y. Singer, An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4, 933–969 (2003)
57. B.J. Frey, D. Dueck, Clustering by passing messages between data points. Science 315(5814), 972–976 (2007)
58. H.P. Friedman, J. Rubin, On some invariant criteria for grouping data. J. Am. Stat. Assoc. 62(320), 1159–1178 (1967)
59. G. Fung, O.L. Mangasarian, Semi-supervised support vector machines for unlabeled data classification. Optim. Methods Softw. 15, 29–44 (2001)
60. G.N. Garcia, T. Ebrahimi, J.M. Vesin, Joint time-frequency-space classification of EEG in a brain-computer interface application. J. Appl. Signal Process. 7, 713–729 (2003)
61. M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman, New York, 1979)
62. Z. Ghahramani, Unsupervised learning, in Advanced Lectures on Machine Learning (Springer, Berlin/New York, 2003), pp. 72–112
63. I.A. Gheyas, L.S. Smith, Feature subset selection in large dimensionality domains. Pattern Recognit. 43(1), 5–13 (2010)
64. T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J.P. Mesirov, H. Coller, M.L. Loh, J.R. Downing, M.A. Caligiuri, C.D. Bloomfield, E.S. Lander, Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286(5439), 531–537 (1999)
65. Y. Grandvalet, S. Canu, Adaptive scaling for feature selection in SVMs, in NIPS, Vancouver, 2002, pp. 553–560
66. I. Guyon, A. Elisseeff, An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
67. I. Guyon, J. Weston, S. Barnhill, V. Vapnik, Gene selection for cancer classification using support vector machines. Mach. Learn. 46, 389–422 (2002)
68. Y. Hamamoto, S. Uchimura, Y. Matsuura, T. Kanaoka, S. Tomita, Evaluation of the branch and bound algorithm for feature selection. Pattern Recognit. Lett. 11(7), 453–456 (1990)
69. J.A. Hartigan, Direct clustering of a data matrix. J. Am. Stat. Assoc. 67(337), 123–129 (1972)
70. W.C. Hong, P.F. Pai, Potential assessment of the support vector regression technique in rainfall forecasting. Water Resour. Manag. 21(2), 495–513 (2007)
71. C.W. Hsu, C.C. Chang, C.J. Lin, A practical guide to support vector classification. Technical report, National Taiwan University, 2004
72. Z. Huang, H. Chen, C.J. Hsu, W.H. Chen, S. Wu, Credit rating analysis with support vector machines and neural networks: a market comparative study. Decis. Support Syst. 37, 543–558 (2004)
73. H. Kim, J.X. Zhou, H.C. Morse III, H. Park, A three-stage framework for gene expression data analysis by L1-norm support vector regression. Int. J. Bioinform. Res. Appl. 1(1), 51–62 (2005)
74. A.K. Jain, Data clustering: 50 years beyond k-means. Pattern Recognit. Lett. 31(8), 651–666 (2010)
75. A.K. Jain, R.C. Dubes, Algorithms for Clustering Data (Prentice-Hall, Upper Saddle River, 1988)
76. X. Jiang, L.H. Lim, Y. Yao, Y. Ye, Statistical ranking and combinatorial Hodge theory. Math. Program. 127, 1–42 (2010)
77. T. Joachims, Text categorization with support vector machines: learning with many relevant features, in Proceedings of the European Conference on Machine Learning, ed. by C. Nédellec, C. Rouveirol (Springer, Berlin, 1998), pp. 137–142
78. T. Joachims, Making large-scale SVM learning practical, in Advances in Kernel Methods – Support Vector Learning, ed. by B. Schölkopf, C.J.C. Burges, A.J. Smola (MIT, Cambridge, 1999), pp. 169–184
79. T. Joachims, Transductive learning via spectral graph partitioning, in Proceedings of the 20th International Conference on Machine Learning (ICML), Washington, DC, 2003, pp. 290–297
80. G.H. John, R. Kohavi, K. Pfleger, Irrelevant features and the subset selection problem, in Proceedings of the 11th International Conference on Machine Learning, New Brunswick, 1994, pp. 121–129
81. H. Kashima, J. Hu, B. Ray, M. Singh, K-means clustering of proportional data using L1 distance, in Proceedings of the 19th International Conference on Pattern Recognition (ICPR), Tampa, FL, 2009, pp. 1–4
82. F. Klawonn, A. Keller, Fuzzy clustering based on modified distance measures, in Proceedings of the Third International Symposium on Advances in Intelligent Data Analysis (IDA '99) (Springer, Berlin, 1999), pp. 291–302
83. Y. Kluger, R. Basri, J.T. Chang, M. Gerstein, Spectral biclustering of microarray data: coclustering genes and conditions. Genome Res. 13(4), 703–716 (2003)
84. R. Kohavi, G.H. John, Wrappers for feature subset selection. Artif. Intell. 97(1–2), 273–324 (1997)
85. M. Kudo, J. Sklansky, Comparison of algorithms that select features for pattern classifiers. Pattern Recognit. 33(1), 25–41 (2000)
86. O.E. Kundakcioglu, P.M. Pardalos, The complexity of feature selection for consistent biclustering, in Clustering Challenges in Biological Networks (World Scientific, Hackensack, 2009), pp. 257–266
87. O.E. Kundakcioglu, T. Ünlüyurt, Bottom-up construction of minimum-cost AND/OR trees for sequential fault diagnosis. IEEE Trans. Syst. Man Cybern. A 37(5), 621–629 (2007)
88. O.E. Kundakcioglu, O. Seref, P.M. Pardalos, Multiple instance learning via margin maximization. Appl. Numer. Math. 60(4), 358–369 (2010)
89. T.N. Lal, M. Schroeder, T. Hinterberger, J. Weston, M. Bogdan, N. Birbaumer, B. Schölkopf, Support vector channel selection in BCI. IEEE Trans. Biomed. Eng. 51(6), 1003–1010 (2004)
90. P. Langley, Selection of relevant features in machine learning, in Proceedings of the AAAI Fall Symposium on Relevance, New Orleans, LA (AAAI, 1994), pp. 140–144
91. F. Lauer, G. Bloch, Incorporating prior knowledge in support vector regression. Mach. Learn. 70, 89–118 (2008)
92. S. Lee, A. Verri (eds.), Pattern Recognition with Support Vector Machines, Niagara Falls (Springer, Berlin/New York, 2002)
93. Y. Linde, A. Buzo, R. Gray, An algorithm for vector quantizer design. IEEE Trans. Commun. 28(1), 84–95 (1980)
94. H. Liu, L. Yu, Toward integrating feature selection algorithms for classification and clustering. IEEE Trans. Knowl. Data Eng. 17(4), 491–502 (2005)
95. S. Lloyd, Least squares quantization in PCM. IEEE Trans. Inf. Theory 28, 129–137 (1982). Originally circulated as a Bell Labs technical note in 1957
96. J.B. MacQueen, Some methods for classification and analysis of multivariate observations, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (University of California Press, Berkeley, 1967), pp. 281–297
97. S. Madeira, A. Oliveira, Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans. Comput. Biol. Bioinform. 1, 24–45 (2004)
98. P.K. Mallapragada, R. Jin, A.K. Jain, Y. Liu, SemiBoost: boosting for semi-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 31(11), 2000–2014 (2009)
99. J. Mao, A.K. Jain, A self-organizing network for hyperellipsoidal clustering (HEC). IEEE Trans. Neural Netw. 7(1), 16–29 (1996)
100. G.J. McLachlan, T. Krishnan, The EM Algorithm and Extensions (Wiley-Interscience, Hoboken, 2008)
101. Merriam-Webster, Dictionary and Thesaurus – Merriam-Webster Online (2011)
102. B.G. Mirkin, Mathematical Classification and Clustering (Kluwer Academic, Dordrecht, 1996)
103. A. Nahapetyan, S. Busygin, P.M. Pardalos, An improved heuristic for consistent biclustering problems, in Mathematical Modelling of Biosystems (Springer, Berlin, 2008), pp. 185–198
104. S. Nakariyakul, D.P. Casasent, Adaptive branch and bound algorithm for selecting optimal features. Pattern Recognit. Lett. 28(12), 1415–1427 (2007)
105. P.M. Narendra, K. Fukunaga, A branch and bound algorithm for feature subset selection. IEEE Trans. Comput. C-26(9), 917–922 (1977)
106. W.S. Noble, Support vector machine applications in computational biology, in Kernel Methods in Computational Biology (MIT, Cambridge, 2004), pp. 71–92
107. E. Osuna, R. Freund, F. Girosi, An improved training algorithm for support vector machines, in IEEE Workshop on Neural Networks for Signal Processing, New York, 1997, pp. 276–285
108. P.F. Pai, W.C. Hong, A recurrent support vector regression model in rainfall forecasting. Hydrol. Process. 21(6), 819–827 (2007)
109. P.M. Pardalos, E. Romeijn (eds.), Handbook of Optimization in Medicine (Springer, New York/London, 2009)
110. J. Platt, Fast training of SVMs using sequential minimal optimization, in Advances in Kernel Methods: Support Vector Learning (MIT, Cambridge, 1999), pp. 185–208
111. M.H. Poursaeidi, O.E. Kundakcioglu, Robust support vector machines for multiple instance classification. Ann. Oper. Res. (2012, published online). doi:10.1007/s10479-012-1241-z
112. G. Pyrgiotakis, O.E. Kundakcioglu, K. Finton, P.M. Pardalos, K. Powers, B.M. Moudgil, Cell death discrimination with Raman spectroscopy and support vector machines. Ann. Biomed. Eng. 37(7), 1464–1473 (2009)
113. G. Pyrgiotakis, O.E. Kundakcioglu, P.M. Pardalos, B.M. Moudgil, Raman spectroscopy and support vector machines for quick toxicological evaluation of titania nanoparticles. J. Raman Spectrosc. (2011, accepted). doi:10.1002/jrs.2839
114. M. Ris, J. Barrera, D.C. Martins Jr., U-curve: a branch-and-bound optimization algorithm for U-shaped cost functions on Boolean lattices applied to the feature selection problem. Pattern Recognit. 43(3), 557–568 (2010)
115. Y. Saeys, I. Inza, P. Larrañaga, A review of feature selection techniques in bioinformatics. Bioinformatics 23(19), 2507 (2007)
116. N.A. Sakhanenko, G.F. Luger, Shock physics data reconstruction using support vector regression. Int. J. Mod. Phys. 17(9), 1313–1325 (2006)
117. B. Schölkopf, A.J. Smola, Learning with Kernels (MIT, Cambridge, 2002)
118. O. Seref, O.E. Kundakcioglu, P.M. Pardalos, Selective linear and nonlinear classification, in CRM Proceedings and Lecture Notes, vol. 45, ed. by P.M. Pardalos, P. Hansen (American Mathematical Society, Providence, 2008), pp. 211–234
119. O. Seref, O.E. Kundakcioglu, O.A. Prokopyev, P.M. Pardalos, Selective support vector machines. J. Comb. Optim. 17(1), 3–20 (2009)
120. S. Shalev-Shwartz, Y. Singer, N. Srebro, A. Cotter, Pegasos: primal estimated sub-gradient solver for SVM. Math. Program. B 127, 3–30 (2011)
121. J. Shawe-Taylor, N. Cristianini, Kernel Methods for Pattern Analysis (Cambridge University Press, Cambridge, 2004)
122. Q. Sheng, Y. Moreau, B. De Moor, Biclustering microarray data by Gibbs sampling. Bioinformatics 19, 196–205 (2003)
123. H.D. Sherali, J. Desai, A global optimization RLT-based approach for solving the fuzzy clustering problem. J. Glob. Optim. 33(4), 597–615 (2005)
124. Y. Shi, Y. Tian, G. Kou, Y. Peng, J. Li, Optimization Based Data Mining: Theory and Applications (Springer, New York, 2011)
125. O. Shirokikh, V. Stozhkov, V. Boginski, Combinatorial optimization techniques for network-based data mining, in Handbook of Combinatorial Optimization, 2nd edn. (Springer, 2013)
126. W. Siedlecki, J. Sklansky, On automatic feature selection. Int. J. Pattern Recognit. Artif. Intell. 2(2), 197–220 (1988)
127. V. Sindhwani, S.S. Keerthi, Large scale semi-supervised linear SVMs, in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM, New York, 2006), pp. 477–484
128. P. Somol, P. Pudil, J. Kittler, Fast branch & bound algorithms for optimal feature selection. IEEE Trans. Pattern Anal. Mach. Intell. 26(7), 900–912 (2004)
129. M. Song, C.M. Breneman, J. Bi, N. Sukumar, K.P. Bennett, S. Cramer, N. Tugcu, Prediction of protein retention times in anion-exchange chromatography systems using support vector regression. J. Chem. Inf. Comput. Sci. 42(6), 1347–1357 (2002)
130. I. Steinwart, Support vector machines are universally consistent. J. Complex. 18, 768–791 (2002)
131. Y.F. Sun, Y.C. Liang, C.G. Wu, X.W. Yang, H.P. Lee, W.Z. Lin, Estimate of error bounds in the improved support vector regression. Prog. Nat. Sci. 14(4), 362–364 (2004)
132. M. Szummer, T. Jaakkola, Partially labeled classification with Markov random walks. Adv. Neural Inf. Process. Syst. 2, 945–952 (2002)
133. T. Joachims, Transductive inference for text classification using support vector machines, in Proceedings of the 16th International Conference on Machine Learning (Morgan Kaufmann, San Francisco, 1999), pp. 200–209
134. T.B. Trafalis, H. Ince, Support vector machine for regression and applications to financial forecasting, in Proceedings of the International Joint Conference on Neural Networks (IJCNN), Como, 2002
135. A.C. Trapp, O.A. Prokopyev, Solving the order-preserving submatrix problem via integer programming. INFORMS J. Comput. 22(3), 387–400 (2010)
136. V. Vapnik, The Nature of Statistical Learning Theory (Springer, New York, 1995)
137. V. Vapnik, A. Chervonenkis, Theory of Pattern Recognition (Nauka, Moscow, 1974)
138. V. Vapnik, A. Sterin, On structural risk minimization or overall risk in a problem of pattern recognition. Autom. Remote Control 10, 1495–1503 (1977)
139. J. Wang, On transductive support vector machines, in Prediction and Discovery (American Mathematical Society, Providence, 2007)
140. Z. Wang, J. Yang, G. Li, An improved branch & bound algorithm in feature selection, in Proceedings of the 9th International Conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, Chongqing, 2003, pp. 549–556
141. J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, V. Vapnik, Feature selection for SVMs, in Proceedings of NIPS, Denver, 2000, pp. 668–674
142. Z.L. Wu, C.H. Li, J.K.Y. Ng, K.R.P.H. Leung, Location estimation via support vector regression. IEEE Trans. Mob. Comput. 6(3), 311–321 (2007)
143. X.S. Xie, W.T. Liu, B.Y. Tang, Space based estimation of moisture transport in marine atmosphere using support vector regression. Remote Sens. Environ. 112(4), 1846–1855 (2008)
144. E.P. Xing, R.M. Karp, CLIFF: clustering of high-dimensional microarray data via iterative feature filtering using normalized cuts. Bioinformatics 17, 306–315 (2001)
145. K. Yamamoto, F. Asano, T. Yamada, N. Kitawaki, Detection of overlapping speech in meetings using support vector machines and support vector regression. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E89-A(8), 2158–2165 (2006)
146. S. Yang, P. Shi, Bidirectional automated branch and bound algorithm for feature selection. J. Shanghai Univ. (English Edition) 9(3), 244–248 (2005)
147. B. Yu, B. Yuan, A more efficient branch and bound algorithm for feature selection. Pattern Recognit. 26(6), 883–889 (1993)
148. A.L. Yuille, A. Rangarajan, The concave-convex procedure. Neural Comput. 15(4), 915–936 (2003)
149. X. Zhu, Semi-supervised learning with graphs. PhD thesis, Carnegie Mellon University, 2005, CMU-LTI-05-192
150. X. Zhu, Semi-supervised learning literature survey. Technical report, University of Wisconsin-Madison, 2006
151. X. Zhu, Z. Ghahramani, Learning from labeled and unlabeled data with label propagation. Technical report, Carnegie Mellon University, 2002
152. J. Zhu, S. Rosset, T. Hastie, R. Tibshirani, 1-norm support vector machines, in Proceedings of Advances in Neural Information Processing Systems, Vancouver, 2003
153. X. Zhu, Z. Ghahramani, J. Lafferty, Semi-supervised learning using Gaussian fields and harmonic functions, in Proceedings of the 20th International Conference on Machine Learning (ICML), Washington, DC, 2003, p. 912
154. H. Zou, M. Yuan, The F∞-norm support vector machine. Stat. Sin. 18, 379–398 (2008)

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Industrial Engineering, University of Houston, Houston, TX, USA
  2. Department of Industrial Engineering, University of Houston, Houston, TX, USA
