Crime Feature Selection Constructing Weighted Spanning Tree

Das, Priyanka; Das, Asit Kumar

doi:10.1007/978-981-13-9042-5_33

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 999)

Abstract

This work presents a rough-set-based feature selection scheme for selecting crime features from online newspaper reports of crimes committed against women in India. Only the verbs present in the crime reports are considered as extracted features for the crime analysis task. To retain only distinct verbs, all words that share common synonyms are identified and replaced by a single word. A feature set typically contains many irrelevant features alongside the relevant ones, so accurate classification requires selecting only the relevant features. In the proposed work, the relative indiscernibility relation from rough set theory is used to measure the similarity score between two features relative to the crime type. A weighted undirected graph is then generated with the features as nodes and the inverse similarity score between two features as the weight of the corresponding edge. Prim’s algorithm is applied to this graph to obtain a minimum spanning tree. Finally, a feature selection algorithm iteratively selects the highest-degree node and removes it from the tree until the modified graph becomes a null graph. The selected nodes are taken as the important features sufficient for categorizing crime reports.
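
The graph stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: the verb names and similarity scores are invented toy values (the rough-set similarity computation itself is not reproduced), "inverse similarity" is read as 1/similarity, "null graph" is read as a graph with no remaining edges, and ties in degree are broken alphabetically. The function names `prim_mst` and `select_features` are hypothetical.

```python
import heapq


def prim_mst(nodes, weights):
    """Prim's algorithm on a connected undirected graph.

    nodes   -- list of node names
    weights -- dict mapping (u, v) pairs to edge weights
    Returns a list of (u, v, weight) edges forming a minimum spanning tree.
    """
    adj = {n: [] for n in nodes}
    for (u, v), w in weights.items():
        adj[u].append((w, v))
        adj[v].append((w, u))
    start = nodes[0]
    visited = {start}
    heap = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(heap)
    mst = []
    while heap and len(visited) < len(nodes):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        mst.append((u, v, w))
        for w2, x in adj[v]:
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return mst


def select_features(nodes, tree_edges):
    """Repeatedly pick the highest-degree node and delete it from the tree.

    Stops when no edges remain (assumed meaning of "null graph").
    Ties are broken alphabetically (an assumption, not from the source).
    """
    neighbours = {n: set() for n in nodes}
    for u, v, _ in tree_edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    selected = []
    while any(neighbours.values()):
        best = max(sorted(neighbours), key=lambda n: len(neighbours[n]))
        selected.append(best)
        for n in neighbours.pop(best):
            neighbours[n].discard(best)
    return selected


# Toy similarity scores between verb features (invented for illustration).
similarity = {
    ("assault", "threaten"): 0.8,
    ("assault", "murder"): 0.5,
    ("assault", "kidnap"): 0.4,
    ("kidnap", "rob"): 0.25,
    ("murder", "threaten"): 0.25,
    ("murder", "rob"): 0.2,
    ("kidnap", "threaten"): 0.1,
}
features = ["assault", "kidnap", "murder", "rob", "threaten"]

# Edge weight = inverse similarity, so similar features are close together.
weights = {pair: 1.0 / s for pair, s in similarity.items()}

tree = prim_mst(features, weights)
selected = select_features(features, tree)
print(selected)
```

On this toy input the spanning tree joins assault to threaten, murder and kidnap, and kidnap to rob, so the sketch selects assault first and then kidnap. A different reading of "inverse similarity" (for example 1 - similarity) or a different tie-breaking rule could change the selected set.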

Author information

Authors and Affiliations

Indian Institute of Engineering Science and Technology, Shibpur, Howrah, 711103, India
Priyanka Das & Asit Kumar Das

Corresponding author

Correspondence to Priyanka Das.

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Howrah, West Bengal, India
Asit Kumar Das
Department of Computer Science and Engineering, Sri Sivani College of Engineering, Srikakulam, Andhra Pradesh, India
Janmenjoy Nayak
Department of Computer Application, Veer Surendra Sai University of Technology, Burla, Sambalpur, Odisha, India
Bighnaraj Naik
Department of Bioinformatics, Maulana Abul Kalam Azad University of Technology, Kolkata, West Bengal, India
Soumen Kumar Pati
Faculty of Communication Sciences, University of Teramo, Teramo, Italy
Danilo Pelusi

About this paper

Cite this paper

Das, P., Das, A.K. (2020). Crime Feature Selection Constructing Weighted Spanning Tree. In: Das, A., Nayak, J., Naik, B., Pati, S., Pelusi, D. (eds) Computational Intelligence in Pattern Recognition. Advances in Intelligent Systems and Computing, vol 999. Springer, Singapore. https://doi.org/10.1007/978-981-13-9042-5_33

DOI: https://doi.org/10.1007/978-981-13-9042-5_33
Published: 18 August 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9041-8
Online ISBN: 978-981-13-9042-5
eBook Packages: Intelligent Technologies and Robotics (R0)
