1 Introduction

With the massive amount of information available on the web, search engines are an instrumental tool for finding items in this haystack. Search technology has evolved significantly since the early revolution of Google search methods [Goog98a, Goog98b], yet the way the information is presented has changed very little. The most common method for displaying search results is still a linear, ordered, static list of items that match the query. There have been many suggestions for alternative representation methods [UIbo09, Huma11, GLO14, Comp14], but none have prevailed in the general search arena. A few of these methods found their way into specific domains, for example the representation of image search results, or of recommendations in specific domains such as movies and shows, related consumer items, or music. In these domain-specific cases, certain key features, such as genre and rating in the case of movies, are used to cluster the search results. Still, these methods assume that users know what they are looking for, and can define it well enough using the search terms and mechanism.

In this work we address the cases where users cannot specify exactly what they are looking for, but rather have only a vague idea, and would like to explore some results in order to refine the search process. We term these interactions exploring the search space, and exploring tradeoffs.

Let us use two specific examples to make the above difference clearer. In the first case, a user enters the search query “Napoleon birthday”. The user has specified the search terms well, and depending on the search engine used, the user may get a ‘likely answer’ at the very top of the list, followed by an indexed list of plausible references.

In our second example, a user is trying to choose a movie to watch. The user prefers it to be ‘not too long’, preferably a comedy, and would like it to be ‘recent’. The terms used above already indicate that there are tradeoffs to explore. How much time is ‘not too long’? How long ago still counts as ‘recent’? Would a 90-minute movie from 2 years ago be preferred over a 140-minute movie from last month? And, though the user prefers comedy, what about a short drama (30 min) that was released only yesterday: would that be a better choice?

To address these issues, we introduce a new way to represent the results, which allows the user to interactively explore various tradeoffs in the search space. We describe the concepts, some of the design decisions, and present a working system implemented in Haskell on the Web.

2 Representation as a Network Graph

Network graphs, also referred to as node-link diagrams, have been used in many cases to represent search results [UIbo09]. In these graphs, the user's search terms can be represented as Key-Nodes, and the resulting search records as Record-Nodes connected to them. Clustering of the results is common, in which similarly featured records are displayed in spatially close locations [Stre05, Clas06].

The network graph representation holds, admittedly, a few hurdles to wide adoption. The first is its inability to scale comfortably to large result sets. Once there are too many nodes to display, the user has to investigate deeper to get relevant information. Various methods to deal with this issue have been offered (see [UIbo09], Chap. 10), but all require additional effort on behalf of the user.

Another issue is the connection of the results to the Key-Nodes, or the search terms. In traditional search approaches, each record is naturally connected to each of the Key-Nodes. For example, if one searched for ‘Not Too Long’ as one of the criteria for selecting a movie, every resultant record would be connected to a ‘Movie Length’ Key-Node. This does not convey real information. One of the ways to try and mitigate this is by grouping (clustering) of the records according to their length, but this has only limited benefit.

In this work we address these two issues in multiple ways. To deal with the possibly large number of search records, we limit the number presented to the user at any one time (depending on the size of the screen). Each time the user elects to eliminate one of the records (as not a good fit), the next found record is brought into display. In addition, to avoid overloading the graph with details, the records are labeled with numbers, and each number is described in more detail in a text box on the side of the screen. More details for a record can be retrieved by double-clicking it. For the second issue, that of all records being connected to all Key-Nodes and thus not conveying any useful information, we introduce a natural sub-division of Key-Nodes, or search terms, into sub-Key-Nodes. Thus, for example, a ‘Not Too Long’ query will translate to one Key-Node labeled ‘Length’, with three sub-Key-Nodes labeled ‘Short’, ‘Medium’, and ‘Long’. Each record will be connected to one of these sub-Key-Nodes. This, together with clustering, affords the user a much faster grasp of the search space.
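The two mechanisms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bin boundaries, the function names `sub_key_node` and `visible_records`, and the use of a pre-ranked list are all hypothetical and are not taken from the implemented system.

```python
# Hypothetical cut-offs in minutes; the source does not state the actual boundaries.
LENGTH_BINS = [
    ("Short", 0, 90),
    ("Medium", 90, 120),
    ("Long", 120, float("inf")),
]


def sub_key_node(length_min, bins=LENGTH_BINS):
    """Return the label of the half-open bin [low, high) containing length_min."""
    for label, low, high in bins:
        if low <= length_min < high:
            return label
    raise ValueError(f"no bin covers a length of {length_min} minutes")


def visible_records(ranked, eliminated, capacity):
    """Return the first `capacity` ranked records that the user has not eliminated."""
    remaining = [record for record in ranked if record not in eliminated]
    return remaining[:capacity]
```

Each record is attached to exactly one sub-Key-Node, and eliminating a displayed record lets the next record in the ranking slide into the visible window.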

3 User Interaction

The user may wish to interact with the search query itself, or the results, in multiple ways. When defining the query, the user may wish to define what they mean by ‘Not Too Long’ of a movie. In various applications, sliders and range-values are used to define such values. These values are then used as filters to remove non-relevant data from the search results. This, in a sense, removes the ability to explore the tradeoff space. In this work, we allow the user to specify similar ranges. However, these specifications are not used as hard-limiting factors to filter the search results, but are rather used to specify the Key-Nodes (and sub-Key-Nodes), and the user can still explore changing these interactively and view their influence on the graph, clustering, and records displayed.
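To make the contrast with hard filtering concrete, here is a minimal Python sketch. It assumes one possible reading of a non-limiting range: records inside the range fit fully, and the fit decays linearly outside it. The `tolerance` parameter, the function names, and the linear decay are assumptions for illustration, not the method used in the system.

```python
def hard_filter(lengths, low, high):
    """Conventional slider behaviour: anything outside [low, high] disappears."""
    return [value for value in lengths if low <= value <= high]


def soft_fit(value, low, high, tolerance):
    """Return 1.0 inside [low, high], falling linearly to 0.0 at `tolerance` outside.

    `tolerance` is a hypothetical parameter controlling how quickly the fit decays.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return max(0.0, 1.0 - distance / tolerance)
```

With a range of 80 to 120 minutes, a 140-minute movie is dropped by `hard_filter` but keeps a partial fit under `soft_fit`, so it remains available when the user explores tradeoffs against other criteria.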

After defining the query, and viewing the results, we allow the user to assign different weights to the various Key-Nodes. Thus, it might be that the length of the movie is more important than the genre for the user. By changing the weight of the different Key-Nodes, the clustering and appearance of the graph (and possibly even the records displayed) would change. This is achieved by clicking on either a Key-Node or a sub-Key-Node.
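The effect of re-weighting can be illustrated with a small Python sketch based on the movie example from the introduction. It assumes a normalized weighted sum of per-criterion fit values; the fit numbers, the names `weighted_score` and `rank`, and the scoring rule itself are hypothetical and only show how a change of weights can reorder the records.

```python
def weighted_score(fits, weights):
    """Weighted average of per-criterion fits, with weights normalized to sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return sum(weights[key] * fits[key] for key in weights) / total


def rank(records, weights):
    """Return record names ordered from best to worst weighted score."""
    return sorted(
        records,
        key=lambda name: weighted_score(records[name], weights),
        reverse=True,
    )


# Hypothetical fit values in [0, 1] for the two movies discussed in the introduction.
MOVIES = {
    "90 min, two years old": {"length": 0.9, "recency": 0.4},
    "140 min, last month": {"length": 0.3, "recency": 1.0},
}
```

Calling `rank(MOVIES, {"length": 3, "recency": 1})` puts the shorter, older movie first, while `rank(MOVIES, {"length": 1, "recency": 3})` reverses the order.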

4 Implemented System

One of the main concerns with the implementation is to ensure that the graph, and the representation, does not change too dramatically with any user interaction. This is very important to the user’s ability to remember the graph results and understand them [Memo12]. To address this, we implemented smooth transitions between display states, so the user keeps context while things are moving around. In addition, unlike force-based methods, which run a simulation or model to determine the new location of records, in our implementation the location of records (and Key-Nodes) is determined analytically. Thus, we can guarantee continuity of location as a function of, for example, the weight of the various search terms.
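The text does not give the placement formula, so the following Python sketch only shows one closed-form layout with the continuity property described above. It assumes that a record sits at the weighted barycenter of the Key-Node positions; the names `record_position` and `lerp`, the `affinity` values, and the barycentric rule are all assumptions.

```python
def record_position(anchors, weights, affinity):
    """Place a record at the weighted barycenter of the Key-Node anchor points.

    anchors:  Key-Node name -> (x, y) position
    weights:  Key-Node name -> user-assigned importance
    affinity: Key-Node name -> how strongly this record relates to the Key-Node
    """
    pulls = {key: weights[key] * affinity[key] for key in anchors}
    total = sum(pulls.values())
    if total <= 0:
        raise ValueError("the record must be pulled by at least one Key-Node")
    x = sum(pulls[key] * anchors[key][0] for key in anchors) / total
    y = sum(pulls[key] * anchors[key][1] for key in anchors) / total
    return (x, y)


def lerp(start, end, t):
    """Linear interpolation between two points, used for a smooth transition."""
    return (
        start[0] + t * (end[0] - start[0]),
        start[1] + t * (end[1] - start[1]),
    )
```

Because the position is a ratio of sums that depend continuously on the weights, a small change in a weight moves the record only slightly as long as the total pull stays positive, with no simulation step involved. `lerp` can then animate the record from its old position to its new one.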

The system itself was implemented on various platforms (Matlab, Haskell+Web, d3.js) to experiment in different ways. A demo system will be shared in the talk.

5 Future Directions

As with any user-centered application, the main test is receiving user feedback. To this end, we are planning to make this system available online for users. The system will initially be tailored to specific domains (movie selection, job search, vacation search, etc.), where tailoring the Key-Nodes and settings is easiest.

Another important direction is adapting this concept to small screens. Many of the search interactions today are done on mobile devices with limited screen space. Presenting graphs on small screens presents another challenge.

Figures 1, 2, 3 and 4 are from an earlier version of the interface, but serve to demonstrate the impact of changing the importance (or weight) of Key-Nodes on the layout of the resulting graph. (Video and demo available.)

Fig. 1. Year \(54\,\%\), Length \(13\,\%\)

Fig. 2. Year \(38\,\%\), Length \(30\,\%\)

Fig. 3. Year \(30\,\%\), Length \(38\,\%\)

Fig. 4. Year \(13\,\%\), Length \(54\,\%\)