Topic Indexing of TV Broadcast News Programs

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2721)

Abstract

This paper describes a topic segmentation and indexing system for TV broadcast news programs spoken in European Portuguese. The system is integrated into an alert system for the selective dissemination of multimedia information, developed within the scope of a European project. The goal of this work is to improve the retrieval of specific spoken documents that have been automatically transcribed using speech recognition. Our segmentation algorithm is based on simple heuristics related to anchor detection. Indexing is based on hierarchical concept trees (a thesaurus) containing 22 main thematic domains, for which hidden Markov models and topic language models were created. Ongoing experiments on multiple-topic indexing are also described, in which a confidence measure based on the likelihood ratio test is used as the hypothesis test.
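As a rough illustration of the multiple-topic indexing idea, the sketch below scores a transcribed story against per-topic language models and accepts every topic whose log-likelihood ratio against a background model exceeds a threshold. The abstract does not give the actual model details, so everything here is an assumption: smoothed unigram models stand in for the paper's hidden Markov models and topic language models, and the function names, the toy corpus, and the threshold values are hypothetical.

```python
import math
from collections import Counter


def train_unigram(docs, vocab, alpha=1.0):
    """Add-alpha smoothed unigram log-probabilities over a fixed vocabulary."""
    counts = Counter(w for doc in docs for w in doc if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def avg_log_likelihood(tokens, model):
    """Per-word log-likelihood of a story, ignoring out-of-vocabulary words."""
    known = [w for w in tokens if w in model]
    if not known:
        return float("-inf")
    return sum(model[w] for w in known) / len(known)


def index_story(tokens, topic_models, background, threshold=0.0):
    """Return all topics whose log-likelihood ratio exceeds the threshold.

    The ratio compares each topic model with a topic-independent
    background model (an assumed choice of null hypothesis).
    """
    base = avg_log_likelihood(tokens, background)
    scores = {t: avg_log_likelihood(tokens, m) - base
              for t, m in topic_models.items()}
    accepted = [t for t, s in scores.items() if s > threshold]
    return sorted(accepted, key=lambda t: -scores[t])


# Toy training data (hypothetical, in English for readability).
sports_docs = [["goal", "match", "team"], ["team", "coach", "goal"]]
economy_docs = [["bank", "market", "euro"], ["market", "tax", "bank"]]
all_docs = sports_docs + economy_docs
vocab = {w for doc in all_docs for w in doc}

topic_models = {
    "sports": train_unigram(sports_docs, vocab),
    "economy": train_unigram(economy_docs, vocab),
}
background = train_unigram(all_docs, vocab)

single_topic_story = ["team", "goal", "coach", "match"]
mixed_story = ["team", "goal", "bank", "market"]

single_result = index_story(single_topic_story, topic_models, background)
mixed_result = index_story(mixed_story, topic_models, background,
                           threshold=-0.25)
```

With a threshold of 0, the single-topic story is assigned only the sports topic, while the mixed story is assigned no topic at all; lowering the threshold lets both topics through. This illustrates why the threshold of the hypothesis test matters when a story may carry more than one topic.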

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amaral, R., Trancoso, I. (2003). Topic Indexing of TV Broadcast News Programs. In: Mamede, N.J., Trancoso, I., Baptista, J., das Graças Volpe Nunes, M. (eds) Computational Processing of the Portuguese Language. PROPOR 2003. Lecture Notes in Computer Science, vol 2721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45011-4_35

  • DOI: https://doi.org/10.1007/3-540-45011-4_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40436-1

  • Online ISBN: 978-3-540-45011-5

  • eBook Packages: Springer Book Archive
