Improved IR in Cohesion Model for Link Detection System

  • Conference paper
Advances in Data Mining. Theoretical Aspects and Applications (ICDM 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4597)

Abstract

Given two stories, a story link detection system decides whether they discuss the same event. The standard approach is to use the cosine similarity measure to determine whether the two documents are linked. Many researchers have successfully applied query expansion to link detection, building models from the relevant documents retrieved from the collection. In this approach, success depends on the quality of the information retrieval system. In the current research, we propose a new information retrieval system for query expansion that uses the intra-cluster similarity of the retrieved documents in addition to their similarity to the query document. Our technique enhances the quality of the retrieval system and thereby improves the performance of the link detection system. Combining this improved IR with our Cohesion Model gives excellent results in link detection. Experimental results confirm the effect of the improved retrieval system on the query expansion technique.
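
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain term-frequency vectors, a linear mix of the two similarities, and a hypothetical weight `alpha`; the function names `vectorize`, `cosine`, and `rerank` are illustrative. Each retrieved document is scored by its cosine similarity to the query document and by its average cosine similarity to the other retrieved documents (one possible reading of intra-cluster similarity).

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Build a raw term-frequency vector (assumption: whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query: str, docs: list[str], alpha: float = 0.5) -> list[int]:
    """Return document indices ordered by a combined score.

    The score mixes similarity to the query with the mean similarity to
    the other retrieved documents. Both the linear mix and `alpha` are
    assumptions made for illustration.
    """
    q = vectorize(query)
    vecs = [vectorize(d) for d in docs]
    scores = []
    for i, v in enumerate(vecs):
        others = [cosine(v, w) for j, w in enumerate(vecs) if j != i]
        cohesion = sum(others) / len(others) if others else 0.0
        scores.append(alpha * cosine(q, v) + (1 - alpha) * cohesion)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    retrieved = [
        "earthquake hits city",
        "earthquake damages city",
        "stock market rallies",
    ]
    print(rerank("earthquake city", retrieved))
```

In this toy run the off-topic story ranks last because it is dissimilar both to the query and to the rest of the retrieved set, which is the intuition behind adding a cohesion term to the retrieval score.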

Editor information

Petra Perner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lakshmi, K., Mukherjee, S. (2007). Improved IR in Cohesion Model for Link Detection System. In: Perner, P. (ed.) Advances in Data Mining. Theoretical Aspects and Applications. ICDM 2007. Lecture Notes in Computer Science, vol 4597. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73435-2_12

  • DOI: https://doi.org/10.1007/978-3-540-73435-2_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73434-5

  • Online ISBN: 978-3-540-73435-2

  • eBook Packages: Computer Science, Computer Science (R0)
