Abstract
A method for feature extraction that uses feedforward neural networks with a single hidden layer is presented. The topology of the networks is determined by a network construction algorithm and a network pruning algorithm. Network construction starts with just one hidden unit; additional units are added only when they are needed to improve the network's predictive accuracy. Once a fully connected network has been constructed, irrelevant and redundant network connections are removed by pruning. The hidden unit activations of the pruned network are the features extracted from the original dataset. Using artificial datasets, we illustrate how the method works and interpret the extracted features in terms of the original attributes of the datasets. We also discuss how the feature extraction method can be used in conjunction with other learning algorithms, such as decision tree methods, to obtain robust and effective classifiers.
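The construct-then-prune loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the chapter's actual algorithm: it uses plain gradient descent, an accuracy-based stopping rule for adding hidden units, and crude magnitude pruning in place of the likelihood-maximizing construction and penalty-function pruning the chapter builds on. All function names and thresholds here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme activations
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def train(X, y, n_hidden, epochs=3000, lr=0.5):
    """Train a single-hidden-layer network by plain gradient descent.
    X is assumed to carry a bias column of ones."""
    W1 = rng.normal(0, 1, (X.shape[1], n_hidden))  # input -> hidden
    W2 = rng.normal(0, 1, (n_hidden + 1, 1))       # hidden (+bias) -> output
    ones = np.ones((len(X), 1))
    for _ in range(epochs):
        H = sigmoid(X @ W1)                 # hidden activations
        Hb = np.hstack([H, ones])           # append output-layer bias unit
        err = sigmoid(Hb @ W2) - y[:, None]
        gW2 = Hb.T @ err                    # output-layer gradient
        gH = (err @ W2[:-1].T) * H * (1 - H)
        gW1 = X.T @ gH                      # hidden-layer gradient
        W2 -= lr * gW2 / len(X)
        W1 -= lr * gW1 / len(X)
    return W1, W2

def accuracy(X, y, W1, W2):
    Hb = np.hstack([sigmoid(X @ W1), np.ones((len(X), 1))])
    return float(np.mean((sigmoid(Hb @ W2).ravel() > 0.5) == y))

def construct(X, y, target=1.0, max_hidden=8):
    """Start with one hidden unit; add units until accuracy is reached."""
    for h in range(1, max_hidden + 1):
        W1, W2 = train(X, y, h)
        if accuracy(X, y, W1, W2) >= target:
            break
    return W1, W2

def prune(W1, W2, X, y, threshold=0.5):
    """Zero small input-to-hidden weights; keep the pruned network
    only if predictive accuracy does not degrade."""
    W1p = np.where(np.abs(W1) < threshold, 0.0, W1)
    if accuracy(X, y, W1p, W2) >= accuracy(X, y, W1, W2):
        return W1p, W2
    return W1, W2

# A tiny artificial dataset (XOR, with a bias column appended)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

W1, W2 = construct(X, y)
W1, W2 = prune(W1, W2, X, y)
features = sigmoid(X @ W1)  # extracted features = hidden activations
```

The `features` matrix has one column per surviving hidden unit; because pruning leaves each unit connected to only a few inputs, each column can often be read as a simple function of the original attributes, which is what makes the extracted features interpretable.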
© 1998 Springer Science+Business Media New York
Cite this chapter
Setiono, R., Liu, H. (1998). Feature Extraction via Neural Networks. In: Liu, H., Motoda, H. (eds) Feature Extraction, Construction and Selection. The Springer International Series in Engineering and Computer Science, vol 453. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5725-8_12
DOI: https://doi.org/10.1007/978-1-4615-5725-8_12
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7622-4
Online ISBN: 978-1-4615-5725-8
eBook Packages: Springer Book Archive