Development of an audio-visual database system for human identification

Bargale, C. B.; Chaudhuri, S.; Bhattacharyya, P.

doi:10.1007/BFb0016014

C. B. Bargale¹,
S. Chaudhuri¹ &
P. Bhattacharyya²

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1206))

Included in the following conference series:

International Conference on Audio- and Video-Based Biometric Person Authentication

2413 Accesses

Abstract

Database systems dealing with textual contents have been in use for a long time. A database management system (DBMS) allows convenient and efficient storage and retrieval of a huge amount of data. Traditional databases are designed for handling alphanumeric data efficiently, but fail to manage complex data like audio and/or video. One dimensional audio data and two dimensional image data can be stored in the form of a binary large object (BLOB) with no emphasis on the contents. Textual information can be attached to BLOBs for retrieval, but mere a textual information is insufficient for describing the rich contents of data. So there is a need to extend the capabilities of such information management system to handle both audio and visual data. Contents of such data items can be extracted in the form of features which can be used for distinction amongst the instances of these data types.

This paper describes how the relational data model can be extended to retrieve face images and audio data in the form of utterances of alphabets. Face images are characterized by sizes of different objects, e.g. nose, lips and the inter-object distances. The audio data is characterized by pitch, formants and LPC coefficients. The purpose of the paper is to develop an automated system for human identification based on audio-visual querying. The system allows the query to be partly audio, partly visual and textual.

Financial assistance from Alexander Von Humboldt Foundation is greately acknowledged.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

J. R. Bach, S. Paul and R. Jain, ”A visual information management system for interactive retrieval of faces”, IEEE Trans, on Knowledge and Data Engg., vol. 5, August 93, pp. 619–628.
Google Scholar
G. Chow and X. Li, ”Toward a system for automatic facial feature detection”, Pattern Recognition, vol. 26, No. 12, 1993, pp. 1739–1755.
Google Scholar
J. Flanagoan, Speech analysis, Synthesis and perception, II ed. Springer-Verlag pub., 1972.
Google Scholar
M. Flinker and H. Sawhaney, ”Query by image and video content: The QBIC system”, IEEE computer, sept. 1995, pp. 23–30.
Google Scholar
A. J. Goldstein, L. D. Harmon, A. B. Lesk, ”Identification of human faces”, Proc. of IEEE, vol. 59, No. 5, May 1971, pp. 749–760.
Google Scholar
W. I. Grosky, ”Towards a data model for integrated pictorial databases”, Computer Vision, Graphics and Image Processing, 25, 1984, pp. 371–382.
Google Scholar
V. Gudivada, V. Raghavan, ”A unified approach to data modeling and retrieval for a class of image database applications”, in Multimedia database system, Springer-Verlag pub., 1996, pp. 37–73.
Google Scholar
F. Itakura, ”Minimum prediction residual principle applied to speech recognition”, IEEE ASSP-23, Feb. 1975, p. 67.
Google Scholar
R. Jain, S. N. J. Murthy, P. L-J Tran, S. Chatterjee, ”Similarity measures for image databases”, in FUZZ-IEEE'95.
Google Scholar
J. Markel, ”Digital inverse filtering a new tool for formant trajectory estimation”, IEEE Trans. AU-20, Jun. 1972, p. 129.
Google Scholar
S. McCandless, ”An algorithm for automatic formant extraction using linear prediction spectra”, IEEE ASSP-22, April 1972, p. 135.
Google Scholar
N. Miller, ”Pitch detection by data reduction”, IEEE ASSP-23, Feb 1975, p. 72.
Google Scholar
V. E. Ogle, ”Chabot: Retrieval from a relational database of images”, IEEE Computer, Sept. 1995, pp. 40–48.
Google Scholar
J. K. Ousterhout, Tcl and Tk Toolkit, Addison-Wesley pub., 1994.
Google Scholar
N. Roeder, X. Li, ”Accuracy analysis for facial feature detection”, Pattern recognition, Jan. 1996, pp. 143–157.
Google Scholar
A. Samal, P. Iyenger, ”Automatic recognition and analysis of human faces and facial expression: A survey”, Pattern Recognition, vol. 25, No. 1, 1992, pp. 65–77.
Google Scholar
S. Santini and R. Jain, ”Similarity queries in image database”, to appear in CVPR, June 96.
Google Scholar
R. Schafer, L. Rabiner, ”Digital representation of speech signals”, IEEE Proc., vol. 63, April 1975, p. 662.
Google Scholar
G. Y. Tang, ”A management system for an integrated database of pictures and alphanumeric data”, Computer Vision, Graphics and Image Processing, 16, 1981, pp. 270–286.
Google Scholar
A. Yoshitaka, S. Kishida and M. Hirakawa, ”Knowledge assisted content based retrieval for multimedia databases”, IEEE Multimedia, winter 1994, pp. 12–21.
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Electrical Engineering, Indian Institute of Technology, Powai, Bombay, 400 076
C. B. Bargale & S. Chaudhuri
Department of Computer Science and Engineering, Indian Institute of Technology, Powai, Bombay, 400 076
P. Bhattacharyya

Authors

C. B. Bargale
View author publications
You can also search for this author in PubMed Google Scholar
S. Chaudhuri
View author publications
You can also search for this author in PubMed Google Scholar
P. Bhattacharyya
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Josef Bigün Gérard Chollet Gunilla Borgefors

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Bargale, C.B., Chaudhuri, S., Bhattacharyya, P. (1997). Development of an audio-visual database system for human identification. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0016014

Download citation

DOI: https://doi.org/10.1007/BFb0016014
Published: 10 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive

Publish with us

Policies and ethics