Abstract
A new measure for attribute selection, called GD, is proposed. The GD measure is based on Information Theory and makes it possible to detect interdependences between attributes. It is defined through a quadratic form that combines the Mántaras distance with a matrix called the Transinformation Matrix. To assess the quality of the proposed measure, it is compared with two other feature selection methods, the Mántaras distance and the Relief algorithm. The comparison is carried out over 19 datasets with three different induction algorithms.
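The exact definition of GD is given in the paper (see the Lorenzo, Hernández and Méndez entry in the references). As a rough illustration of its ingredients, the sketch below computes, for discrete attributes, the Mántaras distance between each attribute and the class together with a transinformation (mutual information) matrix over attribute pairs, and then combines them in a quadratic form. The particular combination shown (d^T T d) and all function names are assumptions for illustration only, not the authors' formula.

```python
import numpy as np
from collections import Counter

def entropy(values):
    """Shannon entropy H(X) of a discrete sequence, in bits."""
    n = len(values)
    probs = np.array([c / n for c in Counter(values).values()])
    return float(-np.sum(probs * np.log2(probs)))

def joint_entropy(x, y):
    """Joint entropy H(X, Y) of two discrete sequences."""
    return entropy(list(zip(x, y)))

def transinformation(x, y):
    """Transinformation (mutual information) I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - joint_entropy(x, y)

def mantaras_distance(x, y):
    """Normalized Mantaras distance d(X, Y) = 1 - I(X; Y) / H(X, Y)."""
    h_xy = joint_entropy(x, y)
    return 0.0 if h_xy == 0.0 else 1.0 - transinformation(x, y) / h_xy

def gd_like_score(attributes, cls):
    """Hypothetical quadratic form d^T T d: d holds the Mantaras distance
    of each attribute to the class (smaller = more informative), and T is
    the transinformation matrix over the attributes."""
    d = np.array([mantaras_distance(a, cls) for a in attributes])
    T = np.array([[transinformation(a, b) for b in attributes]
                  for a in attributes])
    return float(d @ T @ d)

# toy usage: two binary attributes and a class label
x1 = [0, 0, 1, 1, 0, 1]
x2 = [1, 0, 1, 0, 1, 0]
c = [0, 0, 1, 1, 0, 1]
print(gd_like_score([x1, x2], c))
```

The transinformation matrix is what lets a measure of this kind detect interdependences: pairs of attributes that share information contribute jointly to the quadratic form instead of being scored in isolation.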
References
D. W. Aha and R. L. Bankert. Feature selection for case-based classification of cloud types: An empirical comparison. In Proc. of the 1994 AAAI Workshop on Case-Based Reasoning, pages 106–112. AAAI Press, 1994.
D. W. Aha, D. Kibler, and M. K. Albert. Instance-based learning algorithms. Machine Learning, 6:37–66, 1991.
H. Almuallim and T. G. Dietterich. Learning with many irrelevant features. In Proc. of the Ninth National Conference on Artificial Intelligence, pages 547–552. AAAI Press, 1991.
A. L. Blum and P. Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97:245–271, 1997.
R. Caruana and D. Freitag. Greedy attribute selection. In Proc. of the 11th International Machine Learning Conference, pages 28–36, New Brunswick, NJ, 1994. Morgan Kaufmann.
T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons Inc., 1991.
W. Daelemans and A. van den Bosch. Generalization performance of backpropagation learning on a syllabification task. In Proc. of the Third Twente Workshop on Language Technology, pages 27–38, 1992.
R. Duda and P. Hart. Pattern Classification and Scene Analysis. John Wiley and Sons, 1973.
U. M. Fayyad and K. B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proc. of the 13th Int. Joint Conference of Artificial Intelligence, pages 1022–1027, 1993.
G. H. John, R. Kohavi, and K. Pfleger. Irrelevant features and the subset selection problem. In W. W. Cohen and H. Hirsh, editors, Proc. of the Eleventh International Conference on Machine Learning, pages 121–129. Morgan Kaufmann, San Francisco, CA, 1994.
K. Kira and L. A. Rendell. The feature selection problem: Traditional methods and a new algorithm. In Proc. of the 10th National Conf. on Artificial Intelligence, pages 129–134, 1992.
R. Kohavi and G. H. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1–2):273–324, December 1997.
R. Kohavi, D. Sommerfield, and J. Dougherty. Data mining using MLC++: A machine learning library in C++. In Tools with Artificial Intelligence, pages 234–245. IEEE Computer Society Press, 1996. Received the best paper award.
D. Koller and M. Sahami. Toward optimal feature selection. In Proc. of the 13th Int. Conf. on Machine Learning, pages 284–292. Morgan Kaufmann, 1996.
I. Kononenko. Estimating attributes: Analysis and extensions of RELIEF. In F. Bergadano and L. de Raedt, editors, Machine Learning: ECML-94, pages 171–182, Berlin, 1994. Springer.
N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285–318, 1988.
R. López de Mántaras. A distance-based attribute selection measure for decision tree induction. Machine Learning, 6:81–92, 1991.
J. Lorenzo, M. Hernández, and J. Méndez. GD: A Measure based on Information Theory for Attribute Selection. In Helder Coelho, editor, Proc. of the 6th Ibero-American Conference on Artificial Intelligence, Lecture Notes in Artificial Intelligence. Springer-Verlag, 1998.
C. J. Merz and P. M. Murphy. UCI Repository of machine learning databases. Irvine, CA: University of California, Department of Information and Computer Science, 1996.
P. M. Narendra and K. Fukunaga. A branch and bound algorithm for feature selection. IEEE Trans. on Computers, 26:917–922, 1977.
J. R. Quinlan. Induction of decision trees. Machine Learning, 1:81–106, 1986.
D. Wettschereck and D. W. Aha. Weighting features. In Proc. of the First Int. Conference on Case-Based Reasoning, pages 347–358, 1995.
D. Wettschereck and T. G. Dietterich. An experimental comparison of the nearest-neighbor and nearest-hyperrectangle algorithms. Machine Learning, 19:5–27, 1995.
A. P. White and W. Z. Liu. Bias in information-based measures in decision tree induction. Machine Learning, 15:321–329, 1994.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lorenzo, J., Hernández, M., Méndez, J. (1998). Detection of interdependences in attribute selection. In: Żytkow, J.M., Quafafou, M. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1998. Lecture Notes in Computer Science, vol 1510. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094822
DOI: https://doi.org/10.1007/BFb0094822
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65068-3
Online ISBN: 978-3-540-49687-8
eBook Packages: Springer Book Archive