Crime Feature Selection Constructing Weighted Spanning Tree

  • Conference paper
Computational Intelligence in Pattern Recognition

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 999)

Abstract

This work presents a rough set based feature selection scheme for selecting crime features from online newspaper reports of crimes committed against women in India. Only the verbs present in the crime reports are taken as the extracted features for the crime analysis task. To retain only distinct verbs, words that share common synonyms are identified and replaced by a single word. A feature set usually contains many irrelevant features alongside the relevant ones, so selecting only the relevant features is essential for accurate classification. Here, the relative indiscernibility relation from rough set theory is used to measure the similarity score between two features relative to the crime type. A weighted undirected graph is then generated, with the features as nodes and the inverse similarity score between two features as the weight of the corresponding edge. Prim’s algorithm is applied to this graph to obtain a minimum spanning tree. Finally, a feature selection algorithm iteratively selects the highest-degree node and removes it from the tree until the modified graph becomes a null graph. The selected nodes are considered the important features, sufficient for categorizing crime reports.
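The graph stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the verbs and similarity scores are invented for illustration, the "inverse similarity score" is taken to be the reciprocal, "null graph" is read as a graph with no remaining edges, and ties in degree are broken alphabetically. The function names `prim_mst` and `select_features` are hypothetical.

```python
import heapq
from collections import defaultdict


def prim_mst(weights):
    """Return the edges (u, v, w) of a minimum spanning tree via Prim's algorithm.

    `weights` maps an undirected pair (u, v) to its edge weight.
    The graph is assumed to be connected.
    """
    adj = defaultdict(list)
    for (u, v), w in weights.items():
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))

    start = min(adj)  # deterministic start node (assumption)
    visited = {start}
    heap = list(adj[start])
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        tree.append((u, v, w))
        for edge in adj[v]:
            if edge[2] not in visited:
                heapq.heappush(heap, edge)
    return tree


def select_features(tree_edges):
    """Repeatedly pick the highest-degree node and delete it from the tree.

    Stops when no edges remain (the assumed meaning of "null graph").
    """
    neighbours = defaultdict(set)
    for u, v, _ in tree_edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    selected = []
    while any(neighbours.values()):
        # Highest degree first; alphabetical order breaks ties (assumption).
        best = min(neighbours, key=lambda n: (-len(neighbours[n]), n))
        selected.append(best)
        for other in neighbours.pop(best):
            neighbours[other].discard(best)
    return selected


# Invented similarity scores between verb features (illustration only).
similarity = {
    ("abduct", "assault"): 0.8,
    ("abduct", "kill"): 0.5,
    ("assault", "kill"): 0.4,
    ("assault", "rob"): 0.25,
    ("kill", "rob"): 0.2,
    ("assault", "threaten"): 0.5,
    ("rob", "threaten"): 0.1,
}
# Edge weight = reciprocal of similarity, so similar features are "close".
weights = {pair: 1.0 / s for pair, s in similarity.items()}

tree = prim_mst(weights)
selected = select_features(tree)
print(selected)  # ['assault', 'abduct']
```

In this toy example the spanning tree keeps the four lightest edges that do not form a cycle. The node "assault" has degree 3 and is selected first; removing it leaves a single edge between "abduct" and "kill", so one more selection empties the graph.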

Corresponding author

Correspondence to Priyanka Das.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Das, P., Das, A.K. (2020). Crime Feature Selection Constructing Weighted Spanning Tree. In: Das, A., Nayak, J., Naik, B., Pati, S., Pelusi, D. (eds) Computational Intelligence in Pattern Recognition. Advances in Intelligent Systems and Computing, vol 999. Springer, Singapore. https://doi.org/10.1007/978-981-13-9042-5_33
