1 Introduction

The “big data revolution” presents challenges for the designers of new technologies that capture and analyze data. Researchers and businesses are also challenged to find the most effective ways to use the large amounts of data that are now available to them. According to IBM, 2.5 exabytes (2.5 billion gigabytes) of data were generated daily in 2012 [1], and that number is increasing each year. Harvard Professor Gary King [2] pointed out that the significance of the big data revolution lies not in the quantity of data but in the fact that we now have the necessary software for statistical and computational analysis of the data, as well as tools for linking datasets and visualizing the information.

However, with the large amounts of data that are being generated each day, new forms of data representation are necessary so that users can represent multiple variables simultaneously, identify trends from different perspectives, and communicate the meaning of the data to diverse audiences. In addition, there is a need for tools that work across disciplines and support interdisciplinary research.

Multisensory data representation can enhance the user experience and make data analysis accessible to different audiences through intuitive tools that leverage our innate abilities to understand information with different sensory modalities [3]. Loftin [4] noted that there are limits to the number of variables we can process visually, but the brain is capable of processing information from multiple senses simultaneously. By using multiple senses, it is possible to increase the number of variables and relationships that can be represented in complex data sets.

When designing these tools, it is important to understand the semiotic structure of each design element because the tools must map the audiovisual semantic and syntactic relationships to the variables in the data. Users must also be able to use different media to expand and refine paths through the data and create visual hierarchies that highlight specific relationships, patterns, and outliers. The exploration of complex data sets also requires flexible ways to organize and store data. Audiovisual metadata supports data exploration by providing a fluid database structure that encourages new perspectives and facilitates collaboration between users from different research communities.

This paper includes an analysis of the semiotics of multisensory data representation and provides some design objectives for audiovisual metadata for data organization.

2 Multimedia Data Representation

In multisensory data design, the granularity of audiovisual design ranges from specific or localized representation, which is achieved with graphics, to infinite or non-localized space, which is achieved with sound [5]. Visual encoding and semantic relationships can represent data details, patterns or trends, and the integrated whole.

The Gestalt principles of perception play an important role in helping the user identify patterns and relationships. For example, by assigning specific colors, shapes, textures, and sounds to different variables in a data set, it is possible to quickly identify groups of information and patterns represented by similar variables. Design elements such as colors, textures, shapes, and lines must be distinctly different to avoid confusion and promote quick recognition. It is often possible to use established cognitive and emotional associations with design elements to enhance the communication process. For example, weather maps use colors to represent temperature—blue for cold and red for hot. Spatial mappings of data can define time-based relationships that occur sequentially or simultaneously. These representations may be very specific and show actual times, or they may be relational mappings that show order or simultaneous events. Visuals can depict these spatial mappings through position, transparency, and animation.
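To make the encoding idea concrete, the following minimal Python sketch (an illustration added here, not drawn from the cited studies) assigns each hypothetical variable a distinct, well-separated color and marker shape so that the Gestalt principle of similarity groups related points at a glance; it assumes the matplotlib and NumPy libraries, and the variable names and data are invented.

```python
# A minimal sketch (illustrative, not from the cited studies), assuming
# matplotlib and NumPy: each variable gets a distinct color and marker
# shape, so the Gestalt principle of similarity groups related points.
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical variables and their (deliberately distinct) encodings.
encodings = {
    "temperature": ("tab:red", "o"),   # red for hot, per common convention
    "humidity":    ("tab:blue", "s"),
    "wind":        ("tab:green", "^"),
}

fig, ax = plt.subplots()
for variable, (color, marker) in encodings.items():
    x, y = rng.normal(size=30), rng.normal(size=30)  # placeholder data
    ax.scatter(x, y, c=color, marker=marker, label=variable, alpha=0.8)

ax.legend()
plt.show()
```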

Three-dimensional representations of data provide opportunities to use another axis for mapping relationships such as dynamic changes over time and space. The extra dimension enables researchers to view representations from diverse perspectives and detect relationships that might be hidden with a two-dimensional mapping. It is important that the user be able to rotate these three-dimensional models to show all the spatial relationships. Transparent layers and visual coding with different colors, textures, shapes, line widths, or movement can be used to differentiate the layers of information that appear within a three-dimensional model.
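A rotatable, layered model of this kind can be sketched with matplotlib's 3D toolkit. The example below is a minimal illustration (hypothetical data and layer names, not from Downie's or any other cited system): three semi-transparent layers are stacked along a third axis, and the interactive window lets the user rotate the view.

```python
# A minimal sketch (hypothetical data, not from any cited system), assuming
# matplotlib's 3D toolkit: semi-transparent layers stacked along a third
# axis, in a window the user can rotate to inspect spatial relationships.
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
fig = plt.figure()
ax = fig.add_subplot(projection="3d")

# Three layers, e.g., the same measurements at three time steps.
for step, color in [(0, "tab:blue"), (1, "tab:orange"), (2, "tab:green")]:
    x, y = rng.normal(size=50), rng.normal(size=50)
    # Transparency (alpha) keeps the layers from occluding one another.
    ax.scatter(x, y, zs=step, c=color, alpha=0.5, label=f"time step {step}")

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("time")
ax.legend()
ax.view_init(elev=20, azim=35)  # starting viewpoint; drag to rotate
plt.show()
```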

Marc Downie [6], an artist and member of OpenEndedGroup, a digital art collective, used the open-source Field programming environment, with accelerated graphics processing and projection, to explore a dark data set and uncover properties in the data relationships. The data was actually the simulation output of the Dragonfly network topology for high-performance computing systems. With the visualization, it was possible to rotate a three-dimensional model and view the complex data representation from diverse perspectives (a video showing this data visualization is available at http://vimeo.com/79674603). Downie conducted this data visualization research at the Curtis R. Priem Experimental Media and Performing Arts Center (EMPAC) at Rensselaer Polytechnic Institute in Troy, New York. The visualization was displayed on a large-scale video wall where, using color, position, and other visual elements, it was possible to identify subtle changes in the data that revealed patterns.

Large-scale displays can be a valuable tool for multisensory data representation. With large video walls, it is possible to view data relationships that are difficult or impossible to see on small monitors. The Collaborative Research Augmented Immersive Virtual Environment Laboratory (CRAIVE-Lab) at Rensselaer Polytechnic Institute is a state-of-the-art research space where it is possible to use a large-scale display, sound projection, and physical interaction to represent data relationships. Several projectors, hybrid tracking, and multiple point-of-convergence rendering techniques provide optimal and undistorted views at various distances. A multi-channel audio system and haptic display provide opportunities for multisensory data representation and research in cross-modal perception.

D.L. McGuinness, a leading expert in knowledge representation and semantic web research, noted that the CRAIVE-Lab helped her visualize extremely large, labeled graphs (Fig. 1) that would have been “essentially impossible to see in perspective without access to a large display. Some people refer to these large graphs as ‘hairballs’ or ‘ratsnests’ that are relatively impenetrable. With environments such as CRAIVE, we can start to explore what previously seemed impenetrable” (personal communication, January 13, 2014).

Fig. 1. Using the Collaborative Research Augmented Immersive Virtual Environment (CRAIVE) Laboratory at Rensselaer Polytechnic Institute to visualize very large networks of data (from research conducted by D.L. McGuinness).

Stanford University has created the HANA Immersive Visualization Environment (HIVE) for collaborative data visualization. HIVE consists of a large video wall with 35 backlit monitors. The wall is 10 feet tall and 24 feet wide, with a resolution of 13440 × 5400 pixels [7]. Up to sixteen users can connect to HIVE simultaneously to explore multiple layers of data and zoom into details.

The Software Studies Initiative, led by Lev Manovich [8], used a display system with nearly 287 megapixels of screen resolution called HIPerSpace (Highly Interactive Parallelized Display Space), located at the California Institute for Telecommunications and Information Technology (Calit2) at the University of California, San Diego, to visualize data sets for cultural analytics. The research group developed software that enabled them to interactively display large numbers of paintings, magazine covers, comic book pages, and traversals through video games [9]. Using this state-of-the-art research environment, they were able to identify and analyze changes in artistic style as well as identify other relationships in the cultural data sets (images showing these visualizations are available at http://lab.softwarestudies.com/p/overview-slides-and-video-articles-why.html).

3 Multisensory Design

Despite all these advances in data visualization technology, the visual representation of data is limited by the number of visual characteristics that can be simultaneously assigned to the data without causing confusion or obscuring relationships [3]. By using multiple sensory modalities, it is possible to increase the number of data variables that can be represented in order to highlight trends, patterns, outliers, anomalies, and subtle details that might not be possible to identify with one form of sensory representation.

Different forms of sensory representation have unique semiotic structures which can also complement each other and call attention to relationships that might be missed. Tak and Toet [3] pointed out that multisensory data design can result in “data completion” which enables users to “fill in missing information from one sensory channel with cues from another sensory channel” (p. 559).

Various forms of data visualization have been in use for some time, but data sonification is an area of research that has yet to reach its full potential. Research has shown that the addition of audio cues can enhance the recognition of visual cues [10]. Sound can be both spatial and linear [11]. It can highlight individual elements in data relationships and represent dynamic changes over time. Tonality/atonality, audio progressions, modulation, rhythm, accent, synchronicity, and dissonance can define patterns, subtle changes, outliers, and anomalies. Sound penetrates space and can define relationships with the surrounding three-dimensional environment [11]. Stereo sound can add new dimensions to the perceptual experience by mapping data to multiple spatial parameters. Sound is not limited to the field of vision. It can augment what the viewer is seeing and not interfere with the visual representation of the data.
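A minimal sonification sketch makes the mapping concrete. The example below (our own illustration, using only the Python standard library; the data series and frequency range are invented) maps a data series linearly onto pitch, so rising values are heard as rising tones, and writes the result to a WAV file.

```python
# A minimal sonification sketch (our illustration; the data series and
# frequency range are invented), using only the Python standard library:
# data values are mapped linearly onto pitch and written to a WAV file.
import math
import struct
import wave

RATE = 44100                              # samples per second
data = [3, 5, 8, 13, 21, 13, 8, 5, 3]     # hypothetical data series

def tone(freq_hz, dur_s=0.25, amp=0.4):
    """One sine tone as a list of 16-bit sample values."""
    n = int(RATE * dur_s)
    return [int(amp * 32767 * math.sin(2 * math.pi * freq_hz * i / RATE))
            for i in range(n)]

samples = []
lo, hi = min(data), max(data)
for value in data:
    # Map the data range onto 220-880 Hz, so rising values rise in pitch.
    freq = 220 + (value - lo) / (hi - lo) * (880 - 220)
    samples.extend(tone(freq))

with wave.open("sonification.wav", "wb") as w:
    w.setnchannels(1)   # mono
    w.setsampwidth(2)   # 16-bit samples
    w.setframerate(RATE)
    w.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

Tempo, duration, loudness, and stereo panning could be mapped to further variables in the same way.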

Rhythm is an important audiovisual design element in multisensory design. Rhythm in lines, forms, colors, and textures, can be used to organize data into visual patterns. Sound also adds strong rhythmic elements through tempo, beat, sound duration, silence, and repetition. In dynamic data sets, where the information changes in real time, rhythm is a unifying element throughout the interactive experience that can highlight changes in both the visual and audio information. Overlapping rhythms can create a counterpoint that also helps the user identify relationships between audiovisual variables.

Cross-modal perception can lead to unique perspectives from the integration of different forms of data representation. Sound and visuals encode information in different ways. The integration of cognitive and sensory information creates a metasyntax that is transmodal [12]. The metasyntax creates a fluid semiosis by defining polysemiotic sensory and cognitive models that transcend the meaning of individual media or actions [13]. Semantic structures overlap and define a new audiovisual semiotics that integrates the syntax of the different media [13].

These fluid semantic structures can lead to new perceptual experiences that highlight relationships or patterns that are not evident in the individual audio or visual representations. In multimodal semiotics, the viewer must construct a system of relational codes to interpret the relationships. The different media create multiple levels of perceptual encoding, spatial and temporal relationships, and cognitive associations. Recursive patterns and layers of audiovisual data define multidimensional arrays that can represent simultaneous and sequential relationships and events [11, 14].

4 Semiotics of Action

With new technologies, the gestures and movements of the viewer also become part of the interactive experience and define a spatial grammar of interaction that integrates the virtual and physical spaces [15]. The sensory and rhythmic dimensions of interaction design enable users to engage in the interactive spatial representation of the data. This type of interaction design, called kinesthetic design, helps the viewer understand the visual and cognitive relationships in the spatial representation of information [5]. Berkeley [16] demonstrated that kinesthetic and tactile experiences shape our perception of space. Psychologists have discovered a relationship, which begins to take shape during childhood, between physical movement and our visual and sensory interpretations of space [17, 18]. Piaget and Inhelder [17] noted that “spatial concepts are internalized actions, and not merely mental images of external things or events—or even images of the results of actions” (p. 455).

In immersive, interactive spaces such as the Cave Automatic Virtual Environment (CAVE), the orientation of the human body augments the perception of data relationships. Physical interaction is defined by the viewer’s egocentric space, which refers to the orientation and location of the human body in the surrounding space. Rock [19] pointed out that humans use egocentric space to define objects and spatial relationships, such as up/down and near/far, in terms of the position of the body. Gaines [20] noted:

The frontiers of space begin with the body of an individual subject. The physical limits of the body and its means of conscious perception, through sight, sound, smell, taste, touch and the reasoning mind, all engage in identifying the meanings of the things in the world of experience (p. 174).

Just as a performance artist uses movement through space to define a message, gestures, movement, and physical interaction in multisensory data design can help the user understand the spatial and temporal relationships in data representation.

Physical interaction also adds rhythm, tempo, and direction to the syntax of the interaction design. The rhythm of movement and actions combines with the rhythm of images, sounds, and audiovisual transitions in the virtual space. Djajadiningrat, Matthews, and Stienstra [21] pointed out that the “semantics of motion” enables us to understand an interactive experience (pp. 10–11).

Interactivity may also include haptic interfaces that use inertia, force, torque, vibration, texture, and temperature to represent data variables and relationships. Haptic interfaces add another dimension to the semiotics of action by enabling users to interpret spatial relationships through the sense of touch. Palmerius [22] pointed out that “our sense of touch and kinesthetics is capable of supplying large amounts of intuitive information about the location, structure, stiffness and other material properties of objects” (p. 154). He went on to note that haptic feedback can reinforce visual information, provide complementary cues, and help the user find additional features.

5 Multimedia Metadata

5.1 Design Criteria

The multisensory user experience can also extend to database organization. Most interface designs and information architectures are based on sequential hierarchies that are derived from linguistic categories, deductive reasoning, and diachronic logic. Text-based metadata has been the standard for organizing data. Relationships are defined and organized by a priori ontological structures of metadata that make assumptions about data relationships instead of creating an open framework for exploring new networks and connections between ideas. This approach restricts and limits the possibilities for interpreting the information.

For complex databases, we need to develop alternative methods of organizing data that provide flexible ways to search and compare data by using audiovisual information, as well as text. With multimedia metadata, it is possible to define data as multidimensional, spatial, and temporal relationships. Flexible formats for organizing and accessing information support creative data exploration. Multisensory representations of metadata create dynamic semantic models and search methods that enable users to view data from diverse perspectives and identify new patterns and relationships.

Knowledge is also formed by collaborations and collective participation in defining data relationships. The process is fluid and dynamic, so users need access to information management and database systems that support this type of collaboration. The system should allow participants to use different media to define and revise relational models and connections between networks of information.

This collaborative system should be ontologically flat, without predefined categories or metadata, to avoid assumptions and preconceived ideas about the information and data relationships. Christie and Verran [23, 24] and Srinivasan and Huang [25] highlighted the importance of creating flexible systems that enable users to define relationships. This user-centered approach to data organization should allow users to do the following (a sketch of such a data model appears after the list):

  • employ search methods that use visuals, audio, text, and semantic mapping;

  • create new metadata for groups of information or relationships;

  • modify metadata to reflect new information or relationships;

  • create links between data in different contexts;

  • expand networks of information beyond the local database by including external links to additional data sets and data representations;

  • search for temporal relationships in data; and

  • collaborate with other researchers who can also modify the organization of the information and create new links to data.
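As a rough sketch of what such an ontologically flat system might look like, the following Python data model (hypothetical, and not drawn from the TAMI or Eventspace implementations discussed below) stores media objects without predefined categories and lets users create, revise, and link their own metadata; the comments tie each method to the criteria above.

```python
# A rough, hypothetical sketch of an ontologically flat media store; it is
# not the TAMI or Eventspace implementation. Users, not the system, supply
# all categories.
from dataclasses import dataclass, field

@dataclass
class MediaObject:
    """One item in the store: no predefined categories, only a URI and type."""
    uri: str                    # local file or external link (external data sets)
    media_type: str             # "text", "audio", "image", or "video"
    tags: set = field(default_factory=set)          # user-created metadata
    links: set = field(default_factory=set)         # URIs of related objects
    timestamps: list = field(default_factory=list)  # temporal relationships

class FlatStore:
    def __init__(self):
        self.objects = {}       # uri -> MediaObject

    def add(self, obj):
        self.objects[obj.uri] = obj

    def tag(self, uri, *tags):
        """Create or modify metadata at any time."""
        self.objects[uri].tags.update(tags)

    def link(self, uri_a, uri_b):
        """Connect data across contexts, or out to external resources."""
        self.objects[uri_a].links.add(uri_b)
        if uri_b in self.objects:   # links to external data sets stay one-way
            self.objects[uri_b].links.add(uri_a)

    def search(self, *tags):
        """Tag-based search; visual, audio, or semantic search would plug in here."""
        return [o for o in self.objects.values() if set(tags) <= o.tags]
```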

Search methods that use diverse media and semantic mapping may lead to a new type of multimedia or cross-modal “meta-language” for searches. The meta-language would enable users to define relationships based on multiple meanings and audiovisual representations of the data. This cross-modal approach to defining searches can lead to new perspectives for defining data relationships for database organization and data analysis. Dunsire, Hillmann, Phipps, and Coyle [26] pointed out that “Without the necessity of defining an ‘authoritative’ or ‘best’ mapping, a metadata element can have more than one set of semantics at the same time; this means it should be a simple matter to move from different but compatible definitions as needed within an application” (pp. 32–33).
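The coexistence of multiple semantics that Dunsire et al. describe might be modeled very simply, as follows; this is a hypothetical sketch, and the vocabulary names and definitions are illustrative.

```python
# A hypothetical sketch of one metadata element carrying several compatible
# sets of semantics at once; vocabulary names and definitions are illustrative.
element_semantics = {
    "creator": [
        {"vocabulary": "dublin-core",   "meaning": "entity primarily responsible for the resource"},
        {"vocabulary": "local-archive", "meaning": "community member who contributed the item"},
    ],
}

def resolve(element, vocabulary):
    """Return the definition an application needs, without declaring one 'best'."""
    for mapping in element_semantics[element]:
        if mapping["vocabulary"] == vocabulary:
            return mapping["meaning"]
    raise KeyError(f"no {vocabulary} mapping for {element}")
```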

5.2 Multimedia Metadata Projects

Several research projects are developing multimedia metadata tools and laying the foundation for using these forms of metadata to organize complex data sets. In Australia, at the University of Queensland, Charles Darwin University, and the University of Melbourne, researchers have designed dynamic, multimedia metadata tools and interactive databases to archive indigenous knowledge traditions. Metadata that use diverse media (text, audio, images) and user-defined information structures are necessary to reflect the fluid relationships that characterize indigenous consciousness and cultural traditions.

Christie and Verran [23, 24] designed TAMI (Text, Audio, Movies and Images), an audiovisual database and file management system designed to reflect and perpetuate indigenous knowledge traditions. Researchers in the School of Australian Indigenous Knowledge Systems at Charles Darwin University [27] pointed out that “The database is not a repository of knowledge. It is a digital context for knowledge production. It is work done together within the environment (digital and nondigital) which produces knowledge. The database is ontologically flat, it is the users who encode the relations between objects and the metadata which enriches them” (Description of the Problem, para. 1).

With this type of flexible database organization, researchers can change ontological relationships and discover new connections between ideas and database content. Srinivasan and Huang [25] also acknowledged the importance of using fluid ontologies for archiving indigenous data, and they demonstrated how this type of file management can be beneficial to digital museum archives and other online databases. Their concept of fluid ontologies included “flexible knowledge structures that evolve and adapt to communities’ interest based on contextual information articulated by human contributors, curators, and viewers, as well as artificial bots that are able to track interaction histories and infer relationships among knowledge pieces and preferences of viewers” [25, p. 1]. They noted that knowledge structures should emerge from “the interaction with the very communities that are using the digital museum” and “be truly adaptive and reflective of the priorities and hierarchies of the participants (museum visitor, curator, or contributor)” [25, p. 4].

Srinivasan and Huang [25] developed a project called Eventspace for online exhibits. This project introduced the concept of “metaview.” Users can create a metaview to show how they view the relationships in the database by rearranging the nodes representing the content. These metaviews lead to “snapshots” of the users’ perspectives at specific times, and the different perspectives of the same data set result in the coexistence of “multiple, evolving ontologies” [25, p. 12]. This type of dynamic information architecture for data analysis can lead to new insights and interpretations of data relationships.
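A metaview might be modeled, very roughly, as a timestamped snapshot of how one user has arranged the content nodes. The sketch below is hypothetical and is not Eventspace's actual data model; the field names are invented.

```python
# A very rough, hypothetical model of a "metaview": a timestamped snapshot
# of how one user arranged the content nodes. Field names are invented and
# this is not Eventspace's actual data model.
import time
from dataclasses import dataclass

@dataclass
class Metaview:
    """One user's perspective on the database at a moment in time."""
    user: str
    created: float            # when the snapshot was taken
    positions: dict           # node id -> (x, y) layout position
    edges: set                # user-asserted (node, node) relationships

views = []                    # multiple, evolving ontologies coexist here

def snapshot(user, positions, edges):
    """Record a user's current arrangement as a new metaview."""
    views.append(Metaview(user, time.time(), dict(positions), set(edges)))
```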

These types of interactive, semantic webs for metadata and multimedia databases are important design concepts for large, complex data sets that need to reflect the experiences and perspectives of diverse communities of researchers who collaborate on projects. If researchers are directly involved in the definition of knowledge structures and create interactive models that show where their piece of information fits into the larger context, they can gain new knowledge about data relationships [25]. In addition, a fluid ontology indicates how users interpret information and relationships over time, which adds another dimension to the knowledge structure. Srinivasan and Huang [25] pointed out that our perception of relationships often changes over time with new insights, knowledge, and experiences.

Jane Hunter, head of the ITEE (School of Information Technology and Electrical Engineering) eResearch Group at the University of Queensland in Brisbane, Australia, is assessing the value of semantic annotation systems in next-generation metadata tools. Hunter [28] pointed out the need to develop dynamic knowledge spaces for complex data representations where scholars can analyze and compare data and attach annotations, citations, reviews, and links to other resources. She also noted the need for tools that can track the sources and context of this information and identify redundant, inconsistent, and incomplete information, in addition to indexing, searching, and archiving it [28].
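An annotation record in such a system might minimally carry its source and context alongside the note itself. The sketch below illustrates the kind of structure Hunter describes; it is not her group's actual schema, and the redundancy check is deliberately naive.

```python
# A hypothetical sketch of an annotation record that carries its source and
# context alongside the note itself; this is not Hunter's actual schema.
from dataclasses import dataclass, field

@dataclass
class Annotation:
    target_uri: str           # the data representation being annotated
    body: str                 # the scholar's note, review, or citation
    author: str               # provenance: who made the annotation
    source: str               # where the information came from
    context: str = ""         # experimental or disciplinary context
    links: list = field(default_factory=list)  # links to other resources

def find_redundant(annotations):
    """Naive redundancy check: flag annotations whose target and body repeat."""
    seen, redundant = set(), []
    for a in annotations:
        key = (a.target_uri, a.body)
        if key in seen:
            redundant.append(a)
        seen.add(key)
    return redundant
```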

These research projects in multimedia metadata for archiving cultural databases and data management are changing the way we think about data organization and exploration. Flexible, collaborative information management tools for collective data sharing and analysis are essential for data analysis across disciplines and for interdisciplinary research.

6 Future Directions

Multisensory data design can play an important role in the representation of relationships in large data sets. However, we need more research in order to develop design guidelines that specify which media are most effective for representing specific data relationships [4]. We also need research in cross-modal perception that helps us understand how we interpret the new syntax that results from the integration of the different semiotic structures in multimedia data representation. We can then determine the best ways to use this additional level of semiosis to represent complex data relationships.

Spatial representation of data presents new opportunities to explore data sets in large-scale, physical environments. However, multidimensional, virtual spaces for data exploration also offer new possibilities for data representation that can expand the way we perceive and process data relationships. Biocca [29] pointed out that “Spatial representation in advanced virtual environments is probably one of the most powerful systems for spatial representation and manipulation ever developed” because virtual reality environments “allow ways to represent, use, and manipulate space in manners that have no equivalent in physical space, and virtual environments represent space with only a subset of cues found in physical environments” (pp. 55–56). However, he also noted that “cyberspace as a design, communication, and cognitive environment remains largely unexplored” (p. 56).

We also need more research on the most effective ways to use multimedia metadata to enhance data analysis and collaboration. Current projects that use multimedia metadata to archive cultural data provide the flexibility that is required for creative data exploration and collaborative research. However, we need to apply these tools to a wider range of disciplines and types of data to determine specific guidelines for using text, visuals, and sound to define metadata for different types of information in large data sets.

Finally, it is important to remember that big data is about narratives. The way data is organized and represented helps us understand those narratives. The senses provide an additional way for humans to connect emotionally and cognitively with those underlying stories and interpret their meaning, thus enabling data to take on social and cultural significance beyond their numerical representations.

The future holds many exciting challenges for researchers and interface designers who work with data analysis and representation. Data visualization and sonification will be augmented by the next generation of tools and innovative virtual environments for multisensory data design that channel our innate abilities to process different types of sensory data.