The Extraction of Information and Knowledge from Trained Neural Networks

Livingstone, David J.; Browne, Antony; Crichton, Raymond; Hudson, Brian D.; Whitley, David; Ford, Martyn G.

doi:10.1007/978-1-60327-101-1_12

Protocol

Part of the book series: Methods in Molecular Biology™ (MIMB, volume 458)

Abstract

In the past, neural networks were viewed as classification and regression systems whose internal representations were incomprehensible. It is now becoming apparent that algorithms can be designed that extract comprehensible representations from trained neural networks, enabling them to be used for data mining and knowledge discovery, that is, the discovery and explanation of previously unknown relationships present in data. This chapter reviews existing algorithms for extracting comprehensible representations from neural networks and outlines research to generalize and extend the capabilities of one of these algorithms, TREPAN. This algorithm has been generalized for application to bioinformatics data sets, including the prediction of splice site junctions in human DNA sequences, and to cheminformatics data sets. The results generated on these data sets are compared with those generated by a conventional data mining technique (C5), and appropriate conclusions are drawn.
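
As a rough illustration of the general idea of extracting a comprehensible representation from a trained model (not the TREPAN algorithm itself, whose details the abstract does not give), the sketch below treats a model as a black-box oracle, queries it on sampled inputs, and fits a small decision tree to the oracle's answers. Everything in it is an assumption made for illustration: the `black_box` function and its weights merely stand in for a trained network, and `best_split`, `extract_tree`, `predict`, and `fidelity` are hypothetical helper names.

```python
import math
import random


def black_box(x):
    """Stand-in for a trained network (hypothetical weights, illustration only)."""
    z = 4.0 * x[0] - 3.0 * x[1] + 0.5
    return 1 if 1.0 / (1.0 + math.exp(-z)) >= 0.5 else 0


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(samples, labels):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(samples)
    for f in range(len(samples[0])):
        values = sorted(set(s[f] for s in samples))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for s, y in zip(samples, labels) if s[f] <= t]
            right = [y for s, y in zip(samples, labels) if s[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best


def extract_tree(oracle, samples, depth):
    """Grow a tree whose targets are the oracle's outputs, not the true labels."""
    labels = [oracle(s) for s in samples]
    majority = 1 if 2 * sum(labels) >= len(labels) else 0
    split = None
    if depth > 0 and len(set(labels)) > 1:
        split = best_split(samples, labels)
    if split is None:
        return majority  # leaf: the oracle's majority answer in this region
    f, t = split
    left = [s for s in samples if s[f] <= t]
    right = [s for s in samples if s[f] > t]
    return (f, t,
            extract_tree(oracle, left, depth - 1),
            extract_tree(oracle, right, depth - 1))


def predict(tree, x):
    """Follow the splits down to a leaf and return its class."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


def fidelity(tree, oracle, samples):
    """Fraction of samples on which the tree agrees with the oracle."""
    return sum(predict(tree, s) == oracle(s) for s in samples) / len(samples)


rng = random.Random(0)
train = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]
test = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]
tree = extract_tree(black_box, train, depth=3)
print("fidelity on held-out queries:", fidelity(tree, black_box, test))
```

Note that the quantity reported is fidelity, that is, agreement between the extracted tree and the model it approximates, rather than accuracy against ground-truth labels. A conventional tree learner, by contrast, is fitted directly to the labeled data.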

Acknowledgement

This chapter is dedicated to our dear friend and colleague, Martyn, who passed away after a brave fight with cancer on June 7, 2007.

Author information

Authors and Affiliations

ChemQuest, Sandown, UK and Centre for Molecular Design, University of Portsmouth, Portsmouth, Hampshire, UK
David J. Livingstone CChem FRSC
Department of Computing, School of Engineering and Physical Sciences, University of Surrey, Guildford, Surrey, UK
Antony Browne Dr.
Centre for Molecular Design, University of Portsmouth, Portsmouth, Hampshire, UK
Raymond Crichton
Centre for Molecular Design, University of Portsmouth, Portsmouth, PO1 2DY, Hampshire, UK
Brian D. Hudson BSc, PhD
Centre for Molecular Design, University of Portsmouth, Portsmouth, Hampshire, UK
David Whitley Dr.

Corresponding author

Correspondence to Brian D. Hudson BSc, PhD.

Editor information

Editors and Affiliations

ChemQuest, Sandown, Isle of Wight, PO36 8LZ, UK
David J. Livingstone

About this protocol

Cite this protocol

Livingstone, D.J., Browne, A., Crichton, R., Hudson, B.D., Whitley, D., Ford, M.G. (2008). The Extraction of Information and Knowledge from Trained Neural Networks. In: Livingstone, D.J. (eds) Artificial Neural Networks. Methods in Molecular Biology™, vol 458. Humana Press. https://doi.org/10.1007/978-1-60327-101-1_12

DOI: https://doi.org/10.1007/978-1-60327-101-1_12
Publisher Name: Humana Press
Print ISBN: 978-1-58829-718-1
Online ISBN: 978-1-60327-101-1
eBook Packages: Springer Protocols