Rough Sets for Selection of Molecular Descriptors to Predict Biological Activity of Molecules

  • Chapter
Scalable Pattern Recognition Algorithms

Abstract

In conventional drug design, drug discovery proceeds largely by trial and error, with thousands of molecules synthesized along the way. Although this approach is the most effective method to discover drugs, it is financially expensive and labor intensive.

Author information

Corresponding author

Correspondence to Pradipta Maji.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Maji, P., Paul, S. (2014). Rough Sets for Selection of Molecular Descriptors to Predict Biological Activity of Molecules. In: Scalable Pattern Recognition Algorithms. Springer, Cham. https://doi.org/10.1007/978-3-319-05630-2_4

  • DOI: https://doi.org/10.1007/978-3-319-05630-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-05629-6

  • Online ISBN: 978-3-319-05630-2

  • eBook Packages: Computer Science, Computer Science (R0)
