1 Introduction

With the rise of electronic books (e-books) and Internet bookstores such as Amazon, the number of brick-and-mortar bookstores is decreasing [4]. The merit of an Internet bookstore is that we can easily order books that we have already decided to buy. Physical bookstores, however, give us the chance to discover new books. Therefore, we believe that Internet bookstores and traditional bookstores should coexist. In this paper, we propose Zapzap, which supports a user in discovering new books in a bookstore by presenting the rough contents of a book, thereby enticing the user into buying it.

Fig. 1. A user using Zapzap.

2 System Overview

Zapzap is a table-top device comprising an RGB-D sensor and two displays, called the top display and the side display. The top display includes a touch sensor. We used an ELO 3230L touch monitor for the top display, a Philips BDM3201FC/11 monitor for the side display, and an Intel RealSense F200 camera for the RGB-D sensor. Figure 2 shows a schematic of the device.

Once the user places a book on the top display, the system recognizes the book via the RGB-D sensor and then displays keywords, i.e., words often included in the story, around the book. Simultaneously, the side display shows the blurred cover of the book so that users know that the book has been recognized by the system. The user can tap any of the words shown on the top display. On tapping, the system shows a random sentence that contains the keyword.
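The tap interaction can be illustrated with a minimal sketch, assuming the full text is available as a plain string. The function names, the naive punctuation-based sentence splitter, and the sample text are illustrative assumptions, not part of the implemented system.

```python
import random
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def random_sentence_with(keyword, sentences, rng=random):
    """Return a random sentence that contains the keyword, or None if none does."""
    matches = [s for s in sentences if keyword in s]
    return rng.choice(matches) if matches else None


# Made-up sample text, used only to exercise the functions.
text = "Makoto walked through Ikebukuro. Takashi was waiting. Makoto waved at him."
sentences = split_sentences(text)
print(random_sentence_with("Makoto", sentences))
```

A real deployment would need a sentence splitter suited to the language of the book; the regular expression above only handles simple English punctuation.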

Fig. 2. Schematic of Zapzap.

2.1 Book Recognition and Keyword Positioning

To display the keywords of the placed book, we need to identify which book is on the table. It is also important to track its location in order to lay out the keywords around the book. We used AlexNet [2] to identify the book cover. To prevent flickering caused by transient misrecognition, we designed the system to accept the output from AlexNet only when it has recognized the same book for three consecutive frames.
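The three-consecutive-frame rule can be sketched as follows, assuming the classifier yields one label per frame. The class name `StableLabelFilter` and the string labels are hypothetical; the classifier itself is not shown.

```python
class StableLabelFilter:
    """Accept a label only after it was seen in `required` consecutive frames."""

    def __init__(self, required=3):
        self.required = required
        self.candidate = None  # label currently being counted
        self.count = 0         # consecutive frames showing the candidate
        self.accepted = None   # last label that passed the filter

    def update(self, label):
        """Feed the label of one frame; return the currently accepted label."""
        if label == self.candidate:
            self.count += 1
        else:
            self.candidate = label
            self.count = 1
        if self.count >= self.required:
            self.accepted = self.candidate
        return self.accepted


# Hypothetical per-frame classifier outputs with one spurious "B" early on.
frames = ["A", "A", "B", "A", "A", "A", "B", "B", "B"]
stable = StableLabelFilter(required=3)
outputs = [stable.update(label) for label in frames]
print(outputs)
```

In this sketch a single misrecognized frame resets the counter but never changes the accepted book, which is what suppresses flicker.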

For book tracking, we used the depth image obtained from the RGB-D sensor. Because the sensor faces the display, the surface of the book is closer to the sensor than the display surface is; therefore, we can take the centroid of the closer pixels as the center of the book. We extract only pixels that are closer to the sensor than the display surface, and we restrict the extraction to within 3 cm of the top display so that the user can hold a hand over the display without affecting tracking.
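A minimal sketch of this centroid computation with NumPy is given below. It assumes a depth image in meters, a sensor looking perpendicularly at the display so that the display surface has a single known depth, and a small noise margin `min_height_m`; all of these, and the function name `book_center`, are illustrative assumptions.

```python
import numpy as np


def book_center(depth_m, display_depth_m, max_height_m=0.03, min_height_m=0.005):
    """Return the (row, col) centroid of pixels lying just above the display.

    depth_m: 2-D array of distances from the sensor, in meters.
    display_depth_m: distance from the sensor to the empty display surface.
    Pixels higher than max_height_m (e.g. a hand) are ignored.
    """
    height = display_depth_m - depth_m  # height of each pixel above the display
    mask = (height > min_height_m) & (height <= max_height_m)
    if not mask.any():
        return None  # nothing that looks like a book is on the display
    rows, cols = np.nonzero(mask)
    return float(rows.mean()), float(cols.mean())


# Synthetic example: display at 0.60 m, a 2 cm thick book, a hand 20 cm up.
depth = np.full((6, 8), 0.60)
depth[2:4, 3:6] = 0.58  # book surface
depth[0, 0] = 0.40      # hand held over the display
print(book_center(depth, display_depth_m=0.60))
```

If the sensor reports invalid pixels as depth 0 (an assumption about the device), they appear far above the 3 cm limit and are excluded by the same mask.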

2.2 Extracting Keywords

We extract the keywords from the full text of the electronic version of the book. We morphologically analyze the full text and then select the ten most frequent common and proper nouns. Examples of the extracted words follow. Words marked (n) are character names, (l) indicates a location, and (b) indicates a brand. This method tended to preferentially extract character names.

Ikebukuro West Gate Park Footnote 1

  • Kana (n), Ikebukuro (l), Makoto (n), Takashi (n), Shun (n), Kyoichi (n), Isogai (n), Yamai (n), Kazunori (n), Kenji (n)

Mr. Mercedes (part 1) Footnote 2

  • Brady (n), Mercedes (n/b), Pete (n), Paula (n), Handley (n), Barbara (n), Frankie (n), America (l), Toyota (b)
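The frequency-based selection can be sketched as follows, assuming the morphological analyzer has already produced (word, part-of-speech) pairs. The tag names `NOUN` and `PROPN`, the function name `top_nouns`, and the toy tokens are assumptions made for illustration; the analyzer itself is not shown.

```python
from collections import Counter

# Assumed tag names for common and proper nouns; real analyzers use their own.
NOUN_TAGS = {"NOUN", "PROPN"}


def top_nouns(tagged_tokens, n=10):
    """Return the n most frequent nouns from (word, tag) pairs."""
    counts = Counter(word for word, tag in tagged_tokens if tag in NOUN_TAGS)
    return [word for word, _ in counts.most_common(n)]


# Toy input standing in for the analyzer output of a full text.
tokens = [
    ("Makoto", "PROPN"), ("walk", "VERB"), ("Ikebukuro", "PROPN"),
    ("Makoto", "PROPN"), ("park", "NOUN"), ("Makoto", "PROPN"),
    ("Ikebukuro", "PROPN"),
]
print(top_nouns(tokens, n=2))
```

Because names recur throughout a story, a pure frequency ranking like this one naturally favors character names, consistent with the examples above.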

2.3 Placing Keywords

When displaying the keywords, we determined their placement by applying a force-directed graph layout [1]. We considered a graph whose nodes are all the words and a virtual middle point, which represents the position of the book. Then, we set links between all the nodes. We set the strength of the links to \( K_{ww} > K_{wc} \), where \( K_{ww} \) is the strength of the links between each pair of words and \( K_{wc} \) is the strength of the links between each word and the virtual middle point. This causes the words to be arranged such that they spread around the middle point. In addition, we added gravity, which forces the words to gather around the center of the display so that the words are not placed outside the display even when the book is placed at a corner. Note that, to aid understanding, we indicate the virtual middle point as the place where the book is located, even though the book itself is not actually visible.
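The following is a rough sketch of such a layout, not the implementation used in Zapzap. It models every link as a spring and relaxes the word positions by repeated small steps. The rest lengths, the numeric strengths (chosen only so that \( K_{ww} > K_{wc} \)), the gravity coefficient, the step size, and the pixel coordinates are all assumptions.

```python
import numpy as np


def layout_keywords(n_words, book_xy, display_wh, k_ww=0.05, k_wc=0.02,
                    rest_ww=200.0, rest_wc=250.0, gravity=0.01,
                    steps=500, step_size=0.5, seed=0):
    """Return an (n_words, 2) array of word positions in display pixels."""
    rng = np.random.default_rng(seed)
    book = np.asarray(book_xy, dtype=float)  # fixed virtual middle point
    display_center = np.asarray(display_wh, dtype=float) / 2.0
    # Start near the book with small random offsets so words do not coincide.
    pos = book + rng.uniform(-50.0, 50.0, size=(n_words, 2))

    for _ in range(steps):
        # Word-word springs (strength k_ww): diff[i, j] points from i to j.
        diff = pos[None, :, :] - pos[:, None, :]
        dist = np.maximum(np.linalg.norm(diff, axis=2), 1e-9)
        pull = (dist - rest_ww) / dist  # > 0 attracts, < 0 repels
        force = k_ww * (pull[:, :, None] * diff).sum(axis=1)

        # Word-center springs (strength k_wc) to the virtual middle point.
        to_book = book - pos
        d_book = np.maximum(np.linalg.norm(to_book, axis=1, keepdims=True), 1e-9)
        force += k_wc * (d_book - rest_wc) * to_book / d_book

        # Gravity pulls every word toward the center of the display.
        force += gravity * (display_center - pos)

        pos += step_size * force
    return pos


# Book placed at a corner of a hypothetical 1920 x 1080 display.
positions = layout_keywords(10, book_xy=(0.0, 0.0), display_wh=(1920.0, 1080.0))
print(positions.round(1))
```

With the book at a corner, the gravity term shifts the words toward the interior of the display; the sketch does not clamp positions to the display bounds, so the coefficients would need tuning for a real screen.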

Fig. 3. Examples of keyword layouts.

Figure 3 shows examples of the layouts. Figure 3a shows the layout when the user places the book at the center of the display, while Fig. 3b shows the layout when the book is placed at a corner of the display.

3 Related Work

Murai and Ushiama proposed an interface to support the effective browsing of e-books by users [3]. Their system displays an attractiveness map, which visualizes the estimated transition of a user’s interest through the story of the book, thereby showing users where to look first in the book. While their interface is intended for browsing e-books on a computer display, ours is intended for browsing actual books in physical bookstores.

4 Conclusion and Future Work

We presented the concept, prototype design, and implementation of Zapzap, a table-top device that shows the rough contents of a book to support a user browsing books in a physical bookstore. The system recognizes the book and then presents its content as keywords and sentences to avoid ruining the entire story. In a future study, we plan to extract keywords that are more attractive than those produced by the current method, which simply selects the most frequent words. To achieve this, the attractiveness map proposed in Ref. [3] may be applicable. Moreover, we plan to show additional information gathered from outside the book, such as author information or online reviews.

After making these improvements, we will conduct an experiment to determine how informative the system is for the user and whether it motivates the user to read or purchase the books.