
Bottom-Up Audio-Visual Attention for Scene Exploration

Multimodal Computational Attention for Scene Understanding and Robotics

Part of the book series: Cognitive Systems Monographs (COSMOS, volume 30)


Abstract

We can differentiate between two attentional mechanisms: First, overt attention directs the sense organs toward salient stimuli to optimize perception quality.

This is a preview of subscription content, log in via an institution to check access.


Notes

  1. Oppenheim refers to \({\mathscr {F}}[f_p(x)] = \frac{1}{|F(\omega )|}{\mathscr {F}}[f(x)]\) with \(F(\omega ) = {\mathscr {F}}[f](\omega )\).

  2. From a visual saliency perspective, it is not essential that the definition of \(\alpha \) handles the case \(p=0\) separately. Doing so makes the DCT-II matrix orthogonal, but it breaks the direct correspondence with a real-even DFT of half-shifted input. It is even possible to operate entirely without normalization, i.e., to remove the \(\alpha \) terms, which results only in a scale change that is irrelevant for the saliency calculation.

  3. Please note that all operations in Eqs. 3.74 and 3.76 are element-wise. We chose this simplified notation for its compactness and readability.

  4. Please note that the traveling salesman problem (TSP)'s additional requirement to return to the starting city does not change the computational complexity.
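
The spectral-whitening relation quoted in footnote 1 can be sketched in a few lines of NumPy (a minimal 1-D illustration on a toy signal of my own choosing, not the book's implementation): dividing the spectrum by its own magnitude keeps only the phase, and the reconstruction concentrates energy at discontinuities, i.e., at "salient" locations.

```python
import numpy as np

def phase_only(f):
    """Whitened reconstruction F^-1[ F[f] / |F[f]| ]:
    only the phase of the spectrum is retained."""
    F = np.fft.fft(f)
    eps = 1e-8  # guard against division by (numerically) zero magnitudes
    return np.real(np.fft.ifft(F / (np.abs(F) + eps)))

# Toy signal: constant except for one step at n = 32.
f = np.ones(64)
f[32:] = 2.0

fp = phase_only(f)
saliency = np.abs(fp - fp.mean())
# The step location scores far higher than the flat interior,
# which is exactly what makes phase-only spectra useful for saliency.
```

The same whitening idea underlies the spectral residual and image signature saliency models discussed in the chapter; in 2-D one simply applies it to the image's 2-D FFT.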
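
Footnote 2 can be checked numerically. The NumPy sketch below (with N = 8 chosen arbitrarily) builds the DCT-II basis with and without the \(\alpha \) normalization: with \(\alpha \) the matrix is orthogonal, and dropping \(\alpha \) only rescales each coefficient by a positive factor, so sign-based measures such as the image signature are unaffected.

```python
import numpy as np

N = 8  # transform length, chosen arbitrarily for the demo
x = np.arange(N)
p = np.arange(N).reshape(-1, 1)

# DCT-II basis with the alpha normalization discussed in the note
alpha = np.where(p == 0, np.sqrt(1.0 / N), np.sqrt(2.0 / N))
C_norm = alpha * np.cos(np.pi * (2 * x + 1) * p / (2 * N))

# The same basis with the alpha terms removed
C_raw = np.cos(np.pi * (2 * x + 1) * p / (2 * N))

# With alpha, the DCT-II matrix is orthogonal ...
assert np.allclose(C_norm @ C_norm.T, np.eye(N))

# ... and dropping alpha only rescales each coefficient by a
# positive factor, so the coefficient signs (all that a sign-based
# saliency measure uses) agree for any input signal.
f = np.random.default_rng(0).normal(size=N)
assert np.array_equal(np.sign(C_norm @ f), np.sign(C_raw @ f))
```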
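
Footnote 4 can be illustrated with a brute-force sketch (hypothetical 4-city distances): the open-path and closed-tour variants minimize over exactly the same set of n! permutations, so requiring a return to the start changes only the cost of each candidate, not the complexity of the search.

```python
import itertools
import math

def path_len(order, d, close=False):
    """Length of visiting the cities in the given order; with
    close=True the tour returns to the starting city (classic TSP)."""
    length = sum(d[a][b] for a, b in zip(order, order[1:]))
    return length + d[order[-1]][order[0]] if close else length

# Hypothetical symmetric 4-city distance matrix.
d = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 8],
     [10, 4, 8, 0]]

n = len(d)
perms = list(itertools.permutations(range(n)))
assert len(perms) == math.factorial(n)  # brute force is O(n!) either way

# Both variants search the very same candidate set; closing the
# tour only adds one edge to each candidate's cost.
best_path = min(perms, key=lambda o: path_len(o, d))
best_tour = min(perms, key=lambda o: path_len(o, d, close=True))
```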

References

  1. Achanta, R., Süsstrunk, S.: Saliency detection using maximum symmetric surround. In: Proceedings of the International Conference on Image Processing (2010)

  2. Achanta, R., Hemami, S., Estrada, F., Süsstrunk, S.: Frequency-tuned salient region detection. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (2009)

  3. Alley, R.E.: Algorithm Theoretical Basis Document for Decorrelation Stretch. NASA, JPL (1996)

  4. Alsam, A., Sharma, P.: A robust metric for the evaluation of visual saliency algorithms. J. Opt. Soc. Am. (2013)

  5. Asfour, T., Regenstein, K., Azad, P., Schröder, J., Bierbaum, A., Vahrenkamp, N., Dillmann, R.: ARMAR-III: an integrated humanoid platform for sensory-motor control. In: Humanoids (2006)

  6. Asfour, T., Welke, K., Azad, P., Ude, A., Dillmann, R.: The Karlsruhe Humanoid Head. In: Humanoids (2008)

  7. Andreopoulos, A., Hasler, S., Wersing, H., Janssen, H., Tsotsos, J., Körner, E.: Active 3D object localization using a humanoid robot. IEEE Trans. Robot. 47–64 (2010)

  8. Barlow, H.: Possible principles underlying the transformation of sensory messages. Sens. Commun. 217–234 (1961)

  9. Bell, A.J., Sejnowski, T.J.: The "independent components" of natural scenes are edge filters. Vis. Res. 37(23), 3327–3338 (1997)

  10. Begum, M., Karray, F., Mann, G.K.I., Gosine, R.G.: A probabilistic model of overt visual attention for cognitive robots. IEEE Trans. Syst. Man Cybern. B 40, 1305–1318 (2010)

  11. Bernardo, J.M.: Algorithm AS 103: Psi (digamma) function computation. Appl. Stat. 25, 315–317 (1976)

  12. Bian, P., Zhang, L.: Biological plausibility of spectral domain approach for spatiotemporal visual saliency. In: Proceedings of the Annual Conference on Neural Information Processing Systems (2009)

  13. Bruce, N., Tsotsos, J.: Saliency, attention, and visual search: an information theoretic approach. J. Vis. 9(3), 1–24 (2009)

  14. Brown, M., Süsstrunk, S., Fua, P.: Spatio-chromatic decorrelation by shift-invariant filtering. In: CVPR Workshops (2011)

  15. Borji, A., Sihite, D., Itti, L.: What/where to look next? Modeling top-down visual attention in complex interactive environments. IEEE Trans. Syst. Man Cybern. A 99 (2013)

  16. Borji, A., Sihite, D.N., Itti, L.: Quantitative analysis of human-model agreement in visual saliency modeling: a comparative study. IEEE Trans. Image Process. 22(1), 55–69 (2013)

  17. Buchsbaum, G., Gottschalk, A.: Trichromacy, opponent colours coding and optimum colour information transmission in the retina. Proc. R. Soc. Lond. B 220, 89–113 (1983)

  18. Butko, N., Zhang, L., Cottrell, G., Movellan, J.R.: Visual saliency model for robot cameras. In: Proceedings of the International Conference on Robotics and Automation (2008)

  19. Cashon, C., Cohen, L.: The construction, deconstruction, and reconstruction of infant face perception. In: The Development of Face Processing in Infancy and Early Childhood: Current Perspectives, pp. 55–68. NOVA Science Publishers (2003)

  20. Cerf, M., Harel, J., Einhäuser, W., Koch, C.: Predicting human gaze using low-level saliency combined with face detection. In: Proceedings of the Annual Conference on Neural Information Processing Systems (2007)

  21. Cerf, M., Frady, P., Koch, C.: Subjects' inability to avoid looking at faces suggests bottom-up attention allocation mechanism for faces. In: Proceedings of the Society for Neuroscience (2008)

  22. Cerf, M., Frady, E.P., Koch, C.: Faces and text attract gaze independent of the task: experimental data and computer model. J. Vis. 9 (2009)

  23. CLEAR2007: Classification of events, activities and relationships evaluation and workshop. http://www.clear-evaluation.org

  24. Comaniciu, D., Meer, P.: Mean shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 603–619 (2002)

  25. Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms. MIT Press and McGraw-Hill (1990)

  26. Cox, R.T.: Probability, frequency, and reasonable expectation. Am. J. Phys. 14, 1–13 (1946)

  27. Dankers, A., Barnes, N., Zelinsky, A.: A reactive vision system: active-dynamic saliency. In: Proceedings of the International Conference on Computer Vision Systems (2007)

  28. DiBiase, J.H., Silverman, H.F., Brandstein, M.S.: Robust localization in reverberant rooms, ch. 8, pp. 157–180. Springer (2001)

  29. Dragoi, V., Sharma, J., Miller, E.K., Sur, M.: Dynamics of neuronal sensitivity in visual cortex and local feature discrimination. Nat. Neurosci. 883–891 (2002)

  30. Duan, L., Wu, C., Miao, J., Qing, L., Fu, Y.: Visual saliency detection by spatially weighted dissimilarity. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (2011)

  31. Duncan, J.: Selective attention and the organization of visual information. J. Exp. Psychol.: General 113(4), 501–517 (1984)

  32. Ell, T.: Quaternion-Fourier transforms for analysis of two-dimensional linear time-invariant partial differential systems. In: Proceedings of the International Conference on Decision and Control (1993)

  33. Ell, T., Sangwine, S.: Hypercomplex Fourier transforms of color images. IEEE Trans. Image Process. 16(1), 22–35 (2007)

  34. Egly, R., Driver, J., Rafal, R.D.: Shifting visual attention between objects and locations: evidence from normal and parietal lesion subjects. J. Exp. Psychol.: General 123(2) (1994)

  35. Ehrgott, M.: Multicriteria Optimization. Springer (2005)

  36. Eriksen, C.W., St. James, J.D.: Visual attention within and around the field of focal attention: a zoom lens model. Percept. Psychophys. 40(4), 225–240 (1986)

  37. Essa, I.: Ubiquitous sensing for smart and aware environments. IEEE Pers. Commun. 7(5), 47–49 (2000)

  38. Fleming, K.A., Peters II, R.A., Bodenheimer, R.E.: Image mapping and visual attention on a sensory ego-sphere. In: Proceedings of the International Conference on Intelligent Robots and Systems (2006)

  39. Feng, W., Hu, B.: Quaternion discrete cosine transform and its application in color template matching. In: International Conference on Image and Signal Processing, pp. 252–256 (2008)

  40. Frintrop, S., Rome, E., Christensen, H.I.: Computational visual attention systems and their cognitive foundation: a survey. ACM Trans. Appl. Percept. 7(1), 6:1–6:39 (2010)

  41. Fröba, B., Ernst, A.: Face detection with the modified census transform. In: Proceedings of the International Conference on Automatic Face and Gesture Recognition (2004)

  42. Gao, D., Mahadevan, V., Vasconcelos, N.: On the plausibility of the discriminant center-surround hypothesis for visual saliency. J. Vis. 8(7), 1–18 (2008)

  43. Geusebroek, J.M., van den Boomgaard, R., Smeulders, A.W.M., Geerts, H.: Color invariance. IEEE Trans. Pattern Anal. Mach. Intell. 23(12), 1338–1350 (2001)

  44. Geusebroek, J.-M., Smeulders, A., van de Weijer, J.: Fast anisotropic Gauss filtering. IEEE Trans. Image Process. 12(8), 938–943 (2003)

  45. Gillespie, A.R., Kahle, A.B., Walker, R.E.: Color enhancement of highly correlated images. II. Channel ratio and chromaticity transformation techniques. Remote Sens. Environ. 22(3), 343–365 (1987)

  46. Gillies, D.: The subjective theory. In: Philosophical Theories of Probability, ch. 4. Routledge (2000)

  47. Goferman, S., Zelnik-Manor, L., Tal, A.: Context-aware saliency detection. IEEE Trans. Pattern Anal. Mach. Intell. (2012)

  48. Guo, C., Zhang, L.: A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression. IEEE Trans. Image Process. 19, 185–198 (2010)

  49. Guo, C., Ma, Q., Zhang, L.: Spatio-temporal saliency detection using phase spectrum of quaternion Fourier transform. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (2008)

  50. Hall, D., Llinas, J.: Handbook of Multisensor Data Fusion: Theory and Practice. CRC Press (2008)

  51. Hamilton, W.R.: Elements of Quaternions. University of Dublin Press (1866)

  52. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: Proceedings of the Annual Conference on Neural Information Processing Systems (2007)

  53. Heeger, D.J., Bergen, J.R.: Pyramid-based texture analysis/synthesis. In: Proceedings of the Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), pp. 229–238 (1995)

  54. Henderson, J.M.: Human gaze control during real-world scene perception. Trends Cogn. Sci. 498–504 (2003)

  55. Heracles, M., Körner, U., Michalke, T., Sagerer, G., Fritsch, J., Goerick, C.: A dynamic attention system that reorients to unexpected motion in real-world traffic environments. In: Proceedings of the International Conference on Intelligent Robots and Systems (2009)

  56. Hering, E.: Outlines of a Theory of the Light Sense. Harvard University Press (1964)

  57. Hershey, J., Olsen, P.: Approximating the Kullback-Leibler divergence between Gaussian mixture models. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (2007)

  58. Hodge, V.J., Austin, J.: A survey of outlier detection methodologies. Artif. Intell. Rev. 22, 85–126 (2004)

  59. Holsopple, J., Yang, S.: Designing a data fusion system using a top-down approach. In: Proceedings of the International Conference for Military Communications (2009)

  60. Hou, X., Zhang, L.: Saliency detection: a spectral residual approach. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (2007)

  61. Hou, X., Harel, J., Koch, C.: Image signature: highlighting sparse salient regions. IEEE Trans. Pattern Anal. Mach. Intell. 34(1), 194–201 (2012)

  62. Huang, T., Burnett, J., Deczky, A.: The importance of phase in image processing filters. IEEE Trans. Acoust. Speech Signal Process. 23(6), 529–542 (1975)

  63. Itti, L., Baldi, P.: Bayesian surprise attracts human attention. Vis. Res. 49(10), 1295–1306 (2009)

  64. Itti, L., Baldi, P.F.: A principled approach to detecting surprising events in video. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (2005)

  65. Itti, L., Baldi, P.F.: Bayesian surprise attracts human attention. In: Proceedings of the Annual Conference on Neural Information Processing Systems (2006)

  66. Itti, L., Koch, C.: A saliency-based search mechanism for overt and covert shifts of visual attention. Vis. Res. 40(10–12), 1489–1506 (2000)

  67. Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1254–1259 (1998)

  68. Jaynes, E.T.: Probability Theory: The Logic of Science. Cambridge University Press (2003)

  69. Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: Proceedings of the International Conference on Computer Vision (2009)

  70. Judd, T., Durand, F., Torralba, A.: Fixations on low-resolution images. J. Vis. 11(4) (2011)

  71. Judd, T., Durand, F., Torralba, A.: A benchmark of computational models of saliency to predict human fixations. Technical Report, MIT (2012)

  72. Johnson, D., McGeoch, L.: The traveling salesman problem: a case study in local optimization. In: Local Search in Combinatorial Optimization, pp. 215–310 (1997)

  73. Jost, T., Ouerhani, N., von Wartburg, R., Müri, R., Hügli, H.: Assessing the contribution of color in visual attention. Comput. Vis. Image Underst. 100, 107–123 (2005)

  74. Kalinli, O.: Biologically inspired auditory attention models with applications in speech and audio processing. Ph.D. dissertation, University of Southern California, Los Angeles, CA, USA (2009)

  75. Kalinli, O., Narayanan, S.: Prominence detection using auditory attention cues and task-dependent high level information. IEEE Trans. Audio Speech Lang. Process. 17(5), 1009–1024 (2009)

  76. Kahneman, D., Treisman, A.: Changing views of attention and automaticity. In: Varieties of Attention, pp. 26–61. Academic Press (2000)

  77. Kahneman, D., Treisman, A., Gibbs, B.J.: The reviewing of object files: object-specific integration of information. Cogn. Psychol. 24(2), 175–219 (1992)

  78. Kayser, C., Petkov, C.I., Lippert, M., Logothetis, N.K.: Mechanisms for allocating auditory attention: an auditory saliency map. Curr. Biol. 15(21), 1943–1947 (2005)

  79. Klin, A., Jones, W., Schultz, R., Volkmar, F., Cohen, D.: Visual fixation patterns during viewing of naturalistic social situations as predictors of social competence in individuals with autism. Arch. Gen. Psychiatry 59(9), 809–816 (2002)

  80. Kootstra, G., Nederveen, A., de Boer, B.: Paying attention to symmetry. In: Proceedings of the British Machine Vision Conference (2008)

  81. Kühn, B., Belkin, A., Swerdlow, A., Machmer, T., Beyerer, J., Kroschel, K.: Knowledge-driven opto-acoustic scene analysis based on an object-oriented world modelling approach for humanoid robots. In: Proceedings of the 41st International Symposium on Robotics and 6th German Conference on Robotics (2010)

  82. Li, J., Levine, M.D., An, X., He, H.: Saliency detection based on frequency and spatial domain analysis. In: Proceedings of the British Machine Vision Conference (2011)

  83. Liang, Y., Simoncelli, E., Lei, Z.: Color channels decorrelation by ICA transformation in the wavelet domain for color texture analysis and synthesis. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 606–611 (2000)

  84. Lichtenauer, J., Hendriks, E., Reinders, M.: Isophote properties as features for object detection. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition (2005)

  85. Lin, K.-H., Zhuang, X., Goudeseune, C., King, S., Hasegawa-Johnson, M., Huang, T.S.: Improving faster-than-real-time human acoustic event detection by saliency-maximized audio visualization. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (2012)

  86. Lu, S., Lim, J.-H.: Saliency modeling from image histograms. In: Proceedings of the European Conference on Computer Vision (2012)

  87. Luo, W., Li, H., Liu, G., Ngan, K.N.: Global salient information maximization for saliency detection. Signal Process.: Image Commun. 27, 238–248 (2012)

  88. Machmer, T., Moragues, J., Swerdlow, A., Vergara, L., Gosalbez-Castillo, J., Kroschel, K.: Robust impulsive sound source localization by means of an energy detector for temporal alignment and pre-classification. In: Proceedings of the European Signal Processing Conference (2009)

  89. Machmer, T., Swerdlow, A., Kühn, B., Kroschel, K.: Hierarchical, knowledge-oriented opto-acoustic scene analysis for humanoid robots and man-machine interaction. In: Proceedings of the International Conference on Robotics and Automation (2010)

  90. Meger, D., Forssén, P.-E., Lai, K., Helmer, S., McCann, S., Southey, T., Baumann, M., Little, J.J., Lowe, D.G.: Curious George: an attentive semantic robot. In: IROS Workshop: From Sensors to Human Spatial Concepts (2007)

  91. Le Meur, O., Le Callet, P., Barba, D.: Predicting visual fixations on video based on low-level visual features. Vis. Res. 47(19), 2483–2498 (2006)

  92. Muller, J.R., Metha, A.B., Krauskopf, J., Lennie, P.: Rapid adaptation in visual cortex to the structure of images. Science 285, 1405–1408 (1999)

  93. Nakajima, J., Sugimoto, A., Kawamoto, K.: Incorporating audio signals into constructing a visual saliency map. In: Klette, R., Rivera, M., Satoh, S. (eds.) Image and Video Technology, Lecture Notes in Computer Science, vol. 8333. Springer, Berlin, Heidelberg (2014)

  94. Olmos, A., Kingdom, F.A.A.: A biologically inspired algorithm for the recovery of shading and reflectance images. Perception 33, 1463–1473 (2004)

  95. Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)

  96. Onat, S., Libertus, K., König, P.: Integrating audiovisual information for the control of overt attention. J. Vis. 7(10) (2007)

  97. Oppenheim, A., Lim, J.: The importance of phase in signals. Proc. IEEE 69(5), 529–541 (1981)

  98. Orabona, F., Metta, G., Sandini, G.: A proto-object based visual attention model. In: Paletta, L., Rome, E. (eds.) Attention in Cognitive Systems. Theories and Systems from an Interdisciplinary Viewpoint, pp. 198–215 (2008)

  99. Parkhurst, D., Law, K., Niebur, E.: Modeling the role of salience in the allocation of overt visual attention. Vis. Res. 42(1), 107–123 (2002)

  100. Pascale, D.: A review of RGB color spaces...from xyY to R'G'B' (2008)

  101. Peters, R.J., Itti, L.: Applying computational tools to predict gaze direction in interactive visual environments. ACM Trans. Appl. Percept. 5(2) (2008)

  102. Peters, R., Itti, L.: The role of Fourier phase information in predicting saliency. J. Vis. 8(6), 879 (2008)

  103. Peters, R., Iyer, A., Itti, L., Koch, C.: Components of bottom-up gaze allocation in natural images. Vis. Res. 45(18), 2397–2416 (2005)

  104. Posner, M.I.: Orienting of attention. Q. J. Exp. Psychol. 32(1), 3–25 (1980)

  105. Rajashekar, U., Bovik, A.C., Cormack, L.K.: Visual search in noise: revealing the influence of structural cues by gaze-contingent classification image analysis. J. Vis. 6(4), 379–386 (2006)

  106. Ramenahalli, S., Mendat, D.R., Dura-Bernal, S., Culurciello, E., Niebur, E., Andreou, A.: Audio-visual saliency map: overview, basic models and hardware implementation. In: Annual Conference on Information Sciences and Systems (2013)

  107. Rao, R.P., Ballard, D.H.: Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 79–87 (1999)

  108. Ratliff, F.: Mach Bands: Quantitative Studies on Neural Networks in the Retina. Holden-Day, San Francisco (1965)

  109. Reinhard, E., Pouli, T.: Colour spaces for colour transfer. In: Computational Color Imaging, Lecture Notes in Computer Science, vol. 6626, pp. 1–15 (2011)

  110. Reinhard, E., Ashikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Comput. Graph. Appl. 21(5), 34–41 (2001)

  111. Rensink, R.A.: The dynamic representation of scenes. Vis. Cogn. 7, 17–42 (2000)

  112. Rensink, R.A.: Seeing, sensing, and scrutinizing. Vis. Res. 40, 1469–1487 (2000)

  113. Riche, N., Duvinage, M., Mancas, M., Gosselin, B., Dutoit, T.: Saliency and human fixations: state-of-the-art and study of comparison metrics. In: Proceedings of the International Conference on Computer Vision (2013)

  114. RobotCub Consortium: iCub—an open source cognitive humanoid robotic platform. http://www.icub.org

  115. Ruderman, D., Cronin, T., Chiao, C.: Statistics of cone responses to natural images: implications for visual coding. J. Opt. Soc. Am. 15(8), 2036–2045 (1998)

  116. Ruesch, J., Lopes, M., Bernardino, A., Hornstein, J., Santos-Victor, J., Pfeifer, R.: Multimodal saliency-based bottom-up attention: a framework for the humanoid robot iCub. In: Proceedings of the International Conference on Robotics and Automation (2008)

  117. Roelfsema, P.R., Lamme, V.A.F., Spekreijse, H.: Object-based attention in the primary visual cortex of the macaque monkey. Nature 395, 376–381 (1998)

  118. Sangwine, S.J.: Fourier transforms of colour images using quaternion or hypercomplex numbers. Electron. Lett. 32(21), 1979–1980 (1996)

  119. Sangwine, S., Ell, T.: Colour image filters based on hypercomplex convolution. IEEE Proc. Vis. Image Signal Process. 147(2), 89–93 (2000)

  120. Saidi, F., Stasse, O., Yokoi, K., Kanehiro, F.: Online object search with a humanoid robot. In: Proceedings of the International Conference on Intelligent Robots and Systems (2007)

  121. Schauerte, B., Richarz, J., Plötz, T., Thurau, C., Fink, G.A.: Multi-modal and multi-camera attention in smart environments. In: Proceedings of the 11th International Conference on Multimodal Interfaces (ICMI). ACM, Cambridge, MA, USA, Nov. 2009

  122. Schauerte, B., Richarz, J., Fink, G.A.: Saliency-based identification and recognition of pointed-at objects. In: Proceedings of the 23rd International Conference on Intelligent Robots and Systems (IROS). IEEE/RSJ, Taipei, Taiwan, Oct. 2010

  123. Schauerte, B., Fink, G.A.: Focusing computational visual attention in multi-modal human-robot interaction. In: Proceedings of the 12th International Conference on Multimodal Interfaces and 7th Workshop on Machine Learning for Multimodal Interaction (ICMI-MLMI). ACM, Beijing, China, Nov. 2010

  124. Schnupp, J., Nelken, I., King, A.: Auditory Neuroscience. MIT Press (2011)

  125. Serences, J.T., Yantis, S.: Selective visual attention and perceptual coherence. Trends Cogn. Sci. 10(1), 38–45 (2006)

  126. Shic, F., Scassellati, B.: A behavioral analysis of computational models of visual attention. Int. J. Comput. Vis. 73, 159–177 (2007)

  127. Shulman, G.L., Wilson, J.: Spatial frequency and selective attention to spatial location. Perception 16(1), 103–111 (1987)

  128. Simion, C., Shimojo, S.: Early interactions between orienting, visual sampling and decision making in facial preference. Vis. Res. 46(20), 3331–3335 (2006)

  129. Smith, T., Guild, J.: The C.I.E. colorimetric standards and their use. Trans. Opt. Soc. 33(3), 73 (1931)

  130. Song, G., Pellerin, D., Granjon, L.: How different kinds of sound in videos can influence gaze. In: International Workshop on Image Analysis for Multimedia Interactive Services (2012)

  131. Tatler, B., Baddeley, R., Gilchrist, I.: Visual correlates of fixation selection: effects of scale and time. Vis. Res. 45(5), 643–659 (2005)

  132. Temko, A., Malkin, R., Zieger, C., Macho, D., Nadeu, C., Omologo, M.: CLEAR evaluation of acoustic event detection and classification systems. In: Stiefelhagen, R., Garofolo, J. (eds.) Lecture Notes in Computer Science, vol. 4122, pp. 311–322. Springer, Berlin, Heidelberg (2007)

  133. Tipper, S.P., Driver, J., Weaver, B.: Object-centred inhibition of return of visual attention. Q. J. Exp. Psychol. 43, 289–298 (1991)

  134. Torralba, A., Oliva, A., Castelhano, M.S., Henderson, J.M.: Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychol. Rev. 113(4) (2006)

  135. Treisman, A.M., Gelade, G.: A feature-integration theory of attention. Cogn. Psychol. 12(1), 97–136 (1980)

  136. Tsotsos, J.K.: The complexity of perceptual search tasks. In: Proceedings of the International Joint Conference on Artificial Intelligence (1989)

  137. Tsotsos, J.K.: Behaviorist intelligence and the scaling problem. Artif. Intell. 75, 135–160 (1995)

  138. Tsotsos, J.K.: A Computational Perspective on Visual Attention. The MIT Press (2011)

  139. van Rijsbergen, C.J.: Information Retrieval, 2nd edn. Butterworth (1979)

  140. Vijayakumar, S., Conradt, J., Shibata, T., Schaal, S.: Overt visual attention for a humanoid robot. In: Proceedings of the International Conference on Intelligent Robots and Systems (2001)

  141. Walther, D., Koch, C.: Modeling attention to salient proto-objects. Neural Netw. 19(9), 1395–1407 (2006)

  142. Wang, C.-A., Boehnke, S., Munoz, D.: Pupil dilation evoked by a salient auditory stimulus facilitates saccade reaction times to a visual stimulus. J. Vis. 12(9), 1254 (2012)

  143. Welke, K.: Memory-based active visual search for humanoid robots. Ph.D. dissertation, Karlsruhe Institute of Technology (2011)

  144. Welke, K., Asfour, T., Dillmann, R.: Active multi-view object search on a humanoid head. In: Proceedings of the International Conference on Robotics and Automation (2009)

  145. Welke, K., Asfour, T., Dillmann, R.: Inhibition of return in the Bayesian strategy to active visual search. In: Proceedings of the International Conference on Machine Vision Applications (2011)

  146. Wegener, I.: Theoretische Informatik—eine algorithmenorientierte Einführung. Teubner (2005)

  147. Wikimedia Commons (Googolplexbyte): Diagram of the opponent process. http://commons.wikimedia.org/wiki/File:Diagram_of_the_opponent_process.png, retrieved 3 April 2014, License CC BY-SA 3.0

  148. Winkler, S., Subramanian, R.: Overview of eye tracking datasets. In: International Workshop on Quality of Multimedia Experience (2013)

  149. Wu, P.-H., Chen, C.-C., Ding, J.-J., Hsu, C.-Y., Huang, Y.-W.: Salient region detection improved by principle component analysis and boundary information. IEEE Trans. Image Process. 22(9), 3614–3624 (2013)

  150. Xu, T., Chenkov, N., Kühnlenz, K., Buss, M.: Autonomous switching of top-down and bottom-up attention selection for vision guided mobile robots. In: Proceedings of the International Conference on Intelligent Robots and Systems (2009)

  151. Xu, T., Pototschnig, T., Kühnlenz, K., Buss, M.: A high-speed multi-GPU implementation of bottom-up attention using CUDA. In: Proceedings of the International Conference on Robotics and Automation (2009)

  152. Yu, Y., Gu, J., Mann, G., Gosine, R.: Development and evaluation of object-based visual attention for automatic perception of robots. IEEE Trans. Autom. Sci. Eng. 10(2), 365–379 (2013)

  153. Zadeh, L.: Fuzzy sets. Inform. Control 8(3), 338–353 (1965)

  154. Zhao, Q., Koch, C.: Learning a saliency map using fixated locations in natural scenes. J. Vis. 11(3), 1–15 (2011)

  155. Zhang, L., Tong, M.H., Marks, T.K., Shan, H., Cottrell, G.W.: SUN: a Bayesian framework for saliency using natural statistics. J. Vis. 8(7) (2008)

  156. Zhou, J., Jin, Z., Yang, J.: Multiscale saliency detection using principle component analysis. In: International Joint Conference on Neural Networks, pp. 1–6 (2012)


Author information

Correspondence to Boris Schauerte.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Schauerte, B. (2016). Bottom-Up Audio-Visual Attention for Scene Exploration. In: Multimodal Computational Attention for Scene Understanding and Robotics. Cognitive Systems Monographs, vol 30. Springer, Cham. https://doi.org/10.1007/978-3-319-33796-8_3

  • DOI: https://doi.org/10.1007/978-3-319-33796-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-33794-4

  • Online ISBN: 978-3-319-33796-8
