1 Introduction

Traditional sketching methods include the template approach, in which a transparent drawing surface (e.g. tracing paper) is placed on top of the template (Fig. 1), a stencil cut placed on top of the drawing surface, and carbon paper placed between the template and the drawing surface. An alternative to traditional sketching tools is virtual tracing. Using technology in such a context is intimate in nature, as it supports activities that are personal and at the same time expands the potential of our bodies by augmenting the precision and drawing capabilities of the hand.

Fig. 1. (a) Template sketching aid: the user places a semi-transparent paper over the template (b). (b) Printed template on A3 paper.

Virtual tracing is a method of creating physical sketches on paper given a virtual template on a mobile device (e.g. a mobile phone or a tablet). The mobile device renders a virtual template image, such as a contour line, onto the device screen together with a live video stream of the drawing surface. By looking through the screen, or into a virtual mirror, the user is able to see both the virtual image and the hand holding the pen, which allows the user to transcribe information from the virtual image onto the paper (Figs. 2 and 3).

Fig. 2. Static Peephole (SP) virtual tracing method. Every time the mobile device is moved to reveal an unfinished segment of the drawing surface, the user needs to align the virtual template (red line) with what has been drawn thus far using touchscreen gestures. (Color figure online)

Fig. 3. Magic Lens (ML) virtual tracing method. The virtual template is projected onto the drawing surface by tracking the position of the marker.

Compared to traditional methods, virtual tracing has a clear advantage in that it does not require the physical production of sketching aids, which is particularly problematic when one desires to draw on large formats. With virtual tracing, the drawing size is not limited; however, when the drawing surface does not fit on the screen, the user needs to move the device in order to reveal the unfinished part of the drawing surface. The core challenge is then the alignment of the virtual template with what has been drawn thus far. One possibility is to ask the user to manually align the virtual template by dragging the image around the screen using touchscreen gestures. This interaction method is better known as static peephole (SP) [13]. Because the alignment is manual, drawing is only possible as long as the device is held perfectly still (e.g. on a stand) (see Fig. 2).

An alternative is the Magic Lens (ML) interaction paradigm [1]. ML is an augmented reality (AR) interface in which the lens acts as a transparent glass pane revealing an enhanced scene behind the pane. In the case of virtual tracing, the ML augments a physical sketch by projecting a virtual template onto the drawing surface irrespective of the position and orientation of the device (see Fig. 3). This has the following advantages over SP: (i) as long as it is possible to track the camera pose in relation to the drawing surface, the ML automatically aligns the virtual image with what has been drawn thus far; and (ii) as the alignment is done at each rendered frame, the user does not need to keep the device perfectly still while drawing a particular segment and may hold the phone in hand. However, the ML is highly dependent on camera tracking, which may diminish the sketching experience, particularly as it is difficult to implement robust and accurate camera tracking on a blank drawing surface where the hand holding the pencil can occlude segments of the scene. Additionally, when compared to traditional sketching aids, both virtual tracing methods require the user to look through the phone while sketching and show only a segment of the image being drawn at any one time.

These observations open up interesting questions, such as: (i) How effective are phones in supporting user sketching through virtual tracing? (ii) Do users find the advantages of the ML useful? (iii) Can users draw whilst holding the ML in hand? In order to answer these questions, we built a prototype and ran a user study with seven participants who drew a contour on A3 paper using a pencil and three different interaction methods: a traditional template, the static peephole (SP) virtual tracing, and the Magic Lens (ML) virtual tracing.

2 Related Work

Despite the fact that mobile AR apps are available to a large number of users [13], it is still uncertain in which contexts ML can actually provide value to users. For instance, comparative studies of ML and SP interfaces did not identify clear advantages of the ML interfaces for navigation [5] and information browsing tasks [6]. Similarly, for gaming [8, 10] and information browsing tasks [9, 15], the ML only proved advantageous in certain contexts, such as when the AR workspace was large [9] or the social setting allowed expressive spatial interaction [10].

In information browsing and large document navigation tasks, the ability to retain a mental model of the information space is crucial [18, 19]. This is not the case for in-situ sketching, where the most important part of the task is the user's ability to relate digital information to the real world. Examples include transcribing digital information to the real world (e.g. copying digital instructions or routes onto paper maps, or virtually tracing a character on blank paper) and placing tangible objects at instructed positions within an AR workspace (e.g. AR chess with physical pieces).

The ability to relate augmented information to the real world was previously studied in [3]. The study tasked users with finding and tapping on an augmented-reality target (without seeing their hand in the rendered scene), an action that could serve as a method of establishing common ground between what is shown on the screen and the real world. The results show that when the AR workspace is complex, users require on average more than 6 s to complete this task.

ML has proved popular for supporting sketching (e.g. [11, 14, 16, 17]). However, this body of research predominantly focused on complementing physical sketches and not on supporting in-situ sketching through virtual tracing, which is the focus of this paper. Two of our recent studies looked into virtual tracing. The first is a preliminary study with six participants that presented a dual-camera magic lens and compared it with the ML and SP interfaces [4]. The results show that virtual tracing with the dual-camera magic lens is feasible, achieves a higher perceived satisfaction score, and is faster when compared to ML and SP. However, the dual-camera ML system requires marker placement above the drawing surface, limiting where such a system can be used.

The second, an observational study with three participants, explored how depth distortion affects 3D virtual tracing (e.g. virtually tracing a contour onto a cup) with a 3D virtual tracing prototype [7]. Drawing performance in the study exceeded the authors' expectations, suggesting that depth distortion, whilst holding the object in hand, is not as problematic as initially predicted. However, when the object was placed on a stand and drawing was performed with only one hand (the other being used for holding the phone), participants' performance decreased drastically.

3 Virtual Tracing Prototypes

In order to evaluate a mobile phone as a virtual tracing aid, we implemented two virtual tracing prototypes, namely SP and ML. To keep the mobile device stable when virtually tracing with SP, we built a height-adjustable stand (see Fig. 2). In the case of ML there is no such requirement, as the alignment is done automatically at every rendered frame. This is achieved by tracking the camera pose in relation to the drawing surface with the help of a marker placed on the drawing surface (see Fig. 3). The marker ensures that there are always sufficient features for camera pose tracking, which was implemented using the Vuforia library. As more of the contour is drawn, it could be possible to replace marker tracking with contour tracking systems [11, 12]; however, such tracking systems are prone to failure if the contour is occluded.
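The per-frame alignment described above can be illustrated with a minimal sketch. It assumes that the tracker yields, for every frame, a 3x3 homography mapping points on the paper plane (in millimetres) to screen pixels; the function names and the matrix values below are hypothetical and do not reflect the Vuforia API or the prototype's actual code.

```python
# Minimal sketch: re-project a template contour with a per-frame homography.
# ASSUMPTION: the tracker provides H (paper plane in mm -> screen pixels) for
# each frame. The values of H_frame below are made up for illustration.

def project_point(H, x, y):
    """Map one paper-plane point to screen pixels using a 3x3 homography."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def project_contour(H, contour):
    """Project every point of the template contour for the current frame."""
    return [project_point(H, x, y) for (x, y) in contour]


# Hypothetical frame: 2 px per mm, image origin shifted by (100, 50) px.
H_frame = [[2.0, 0.0, 100.0],
           [0.0, 2.0, 50.0],
           [0.0, 0.0, 1.0]]
template_mm = [(0.0, 0.0), (10.0, 0.0), (10.0, 20.0)]
screen_px = project_contour(H_frame, template_mm)
```

Because the projection is recomputed from the current homography at each frame, the template stays registered to the paper as the device moves, which is why no manual realignment is needed while the marker is tracked.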

Markers can also be occluded. This can be avoided by using multiple markers or a marker that can be moved around. To avoid covering the whole drawing surface with multiple markers, we opted for the second solution. Participants move the marker when it is in the way of the pencil or when it is no longer visible within the camera's field of view. Every time participants move the marker, they have to manually align the virtual template with what has been drawn thus far.

Because the virtual template is projected on top of the drawing surface, it overlays anything that exists on the drawing surface (e.g. pencil markings). The current system is not able to detect and remove the part of the virtual template that has already been drawn. In order to mitigate this effect, we made the virtual template semi-transparent and allowed the user to adjust the transparency level.
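A semi-transparent overlay of this kind is commonly obtained by alpha blending the template colour with the camera frame. The following is a minimal per-pixel sketch under that assumption; the function name and the colour values are hypothetical, and the prototype's actual rendering path is not described here.

```python
# Minimal sketch of a user-adjustable semi-transparent overlay.
# ASSUMPTION: standard "over" alpha blending per pixel; blend_pixel is a
# hypothetical helper, not part of the prototype.

def blend_pixel(frame_rgb, template_rgb, opacity):
    """Blend a template pixel over a camera-frame pixel.

    opacity = 0.0 shows only the camera frame (pencil marks fully visible),
    opacity = 1.0 shows only the template.
    """
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must lie in [0, 1]")
    return tuple(
        round(opacity * t + (1.0 - opacity) * f)
        for f, t in zip(frame_rgb, template_rgb)
    )


paper_grey = (200, 200, 200)   # hypothetical camera pixel showing paper
template_red = (255, 0, 0)     # red template line, as in Fig. 2
blended = blend_pixel(paper_grey, template_red, 0.4)
```

Lowering the opacity lets more of the pencil markings show through the template, at the cost of a fainter guide line.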

4 Methodology

The experiment presented is a within-subjects design with interaction as an independent variable having one of the three values, namely: (i) Template; (ii) Static Peephole (SP); and (iii) Magic Lens (ML). The size of the drawing surface was set to A3 paper in landscape orientation placed on a desk. Whilst completing the task, participants sat at the desk.

4.1 Interaction Modes

In template mode the user placed an A3 printed template below a sheet of tracing paper and drew the contour as shown in Fig. 1.

In the case of SP, the phone was placed on a height-adjustable stand (15–25 cm), as seen in Fig. 2. In order to align the virtual template with what had already been drawn, the user utilized two touchscreen gestures: (i) drag-and-drop for panning; and (ii) pinch for resizing the virtual template.
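The two gestures can be thought of as updating a 2D scale-and-translate transform applied to the template. The sketch below is one possible formulation, assuming uniform scaling about the pinch focal point; the class and method names are hypothetical and are not taken from the prototype.

```python
from dataclasses import dataclass


@dataclass
class TemplateTransform:
    """Hypothetical scale + translation mapping template points to the screen."""
    scale: float = 1.0
    tx: float = 0.0
    ty: float = 0.0

    def to_screen(self, x, y):
        """Map a template point to screen coordinates."""
        return (self.scale * x + self.tx, self.scale * y + self.ty)

    def pan(self, dx, dy):
        """Drag-and-drop: shift the template by the finger displacement."""
        self.tx += dx
        self.ty += dy

    def pinch(self, factor, cx, cy):
        """Pinch: resize by `factor` while keeping screen point (cx, cy) fixed."""
        self.scale *= factor
        self.tx = cx + factor * (self.tx - cx)
        self.ty = cy + factor * (self.ty - cy)


t = TemplateTransform()
t.pan(10.0, 20.0)                 # template point (5, 5) is now at (15, 25)
t.pinch(2.0, 15.0, 25.0)          # zoom in around that same screen point
anchor_after = t.to_screen(5.0, 5.0)
origin_after = t.to_screen(0.0, 0.0)
```

Keeping the focal point fixed during a pinch lets the user first pan one template feature onto its drawn counterpart and then resize around it without losing that correspondence.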

In contrast to SP, the ML mode does not mandate placing the phone on the stand while drawing. Thus, a decision was made to remove the stand, even though this may have placed ML at a disadvantage within this test case. This was done because, in real-world use, removing the need for a stand was considered an important advantage, as it increases the portability of the system. Depending on the stand type, a stand might also reduce the flexibility of the ML interaction by placing restrictions on the phone's movement. Finally, we were also keen on exploring whether users' performance decreased drastically when performing 2D virtual tracing with one hand (i.e. with the other hand holding the phone), as was reported for 3D virtual tracing [7].

4.2 Participants and the Tasks

We recruited seven male participants aged between 23 and 45 (3 employed, 4 students). Recruitment was based on convenience sampling at the department of computer science (3 participants) and within the social circles of the authors (4 participants). Participants came from various backgrounds: nursing (1), architecture (1), computer science (3), medicine (1) and mechanical engineering (1). All participants knew the term augmented reality and had previously used AR systems on a mobile phone.

Participants were tasked with completing a partially finished contour drawing of a cartoon character (Fig. 4a) using a pencil. The drawing was partially finished in order to observe how precisely users manage to align the virtual template to a pre-drawn segment, and to ensure that all participants drew a character of the maximum size that fits on A3 paper.

Fig. 4. (a) Partially finished contour drawing of a cartoon character participants were tasked with. (b) Finished contour drawing.

Participants were asked to complete each task as quickly and as accurately as possible. In total 3 drawings were produced by each participant. Each drawing featured a different cartoon character and was drawn with a different interaction mode using a 2B pencil.

4.3 Data Collection and Experimental Procedure

We predominantly focused on qualitative data collection methods, utilizing questionnaires, observational note taking and video analysis, although we also timed the tasks. As all participants were familiar with tracing using a physical template, they started with this task, followed by the two virtual tracing tasks (randomising the order of SP and ML). Before each virtual tracing task, we demonstrated how the prototype works, and users then started the task without additional training or guidance. The assignment of contours was also randomised. The character contour assignment and the order in which the interaction modes were tested were counterbalanced. After completing all three tasks, the user completed the questionnaire.
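The exact assignment scheme is not detailed above. As an illustration only, the sketch below shows one way such a schedule could be generated, assuming that the template task always comes first, that the SP/ML order alternates between participants, and that the three characters are rotated in a Latin-square fashion. The character labels and the function name are hypothetical.

```python
# Illustrative only: one possible way to vary mode order and character
# assignment. ASSUMPTIONS: template first, SP/ML order alternates by
# participant, characters rotate by participant. Labels are hypothetical.

VIRTUAL_MODES = ("SP", "ML")
CHARACTERS = ("character A", "character B", "character C")


def schedule(participant_index):
    """Return a list of (interaction mode, character) pairs in task order."""
    if participant_index % 2 == 0:
        virtual = VIRTUAL_MODES
    else:
        virtual = VIRTUAL_MODES[::-1]
    modes = ("Template",) + tuple(virtual)

    # Rotate the character list so each character meets each task position.
    shift = participant_index % len(CHARACTERS)
    characters = CHARACTERS[shift:] + CHARACTERS[:shift]
    return list(zip(modes, characters))


plans = [schedule(i) for i in range(7)]   # seven participants
```

With seven participants such a scheme cannot be perfectly balanced (one SP/ML order and one character rotation necessarily occur once more than the others), which is a further reason to treat the results as preliminary.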

The questionnaire started by estimating participants’ perceived satisfaction utilizing the “overall reactions” section from the Questionnaire for User Interaction Satisfaction (QUIS) [2]. In the second part, participants were asked: (i) to rank interaction modes from best to worst and justify their decision; (ii) if they would use the stand in ML mode if one was available; and (iii) to highlight the most difficult part of each task and make suggestions for improvements.

Fig. 5. The figure shows an example of the quality of drawing by overlaying the virtual template over the hand-drawn layer.

5 Results

Due to the small number of participants, a detailed analysis of the significance of the results was not possible; thus, we present results using only descriptive statistics. Although these results are of a preliminary nature, they show trends worth presenting.

5.1 Drawing Quality and Task Time

By overlaying the virtual template contours over the produced drawings (Fig. 5), two researchers independently and subjectively compared the quality of all three drawings for each participant and ranked them from best to worst. As expected, the template scored best, followed by ML and SP (Fig. 6d). However, the results did not highlight any obvious deviations in the obtained rankings. The task time results (Fig. 6c) showed that the template mode was on average more than twice as fast as the SP and ML modes, which achieved comparable task times.

Fig. 6. (a) QUIS scores [1–9]; (b) Ranking results for preference (smaller is better); (c) Average task time completion in minutes; (d) Ranking results for drawing quality (smaller is better).

5.2 Questionnaire Results

QUIS results show that the template mode produced the highest scores across all properties, whereas SP and ML obtained similar scores (Fig. 6a). In the preference ranking a similar pattern repeats, with the template mode ranking best and SP and ML achieving a similar rank.

Perhaps surprisingly, when participants were asked to name the most difficult aspect of the task, none said they found it difficult to look through the phone while tracing. Instead, most reported difficulties linked to manual alignment. In SP mode manual alignment was required every time the phone was moved, whereas in ML mode manual alignment was required every time the marker was moved. Even though fewer alignments were required in ML mode, participants reported this difficulty for both modes.

When asked what should be changed in virtual tracing sketching aids, besides solving manual alignment, participants also highlighted a general dislike of the marker and the fact that they had to avoid occluding the marker for the system to operate.

One participant proposed modifying the system so that it would remove the parts of the virtual template that had already been drawn, because this would make it easier to “know how my drawing really looks and what I still need to draw”. Additionally, in SP mode, participants expressed the need for an extra way to gain an overview of the whole virtual image. When participants were asked if they would use the stand in ML mode if available, 3 out of 7 participants said they would, citing tiredness, as the drawing took on average 10.7 min.

5.3 Observational Results

Observations of the distance between the phone and the drawing surface showed that participants adjusted the stand or held the phone at 17–22 cm. Participants always looked at the paper through the phone, keeping the pencil within the camera's field of view (FOV). This was observed even in situations where it was obvious how to fill in the missing segment of the drawing and where completing it using virtual tracing required the additional effort of aligning the virtual contour with what had been drawn thus far. Hence, whilst virtually tracing, the pencil never crossed the boundary between the phone and the surrounding context, although users did look directly at the paper to see how they were progressing (three participants were observed to do this several times).

The above suggests that covering the drawing surface with a virtual template had some undesirable effects. One such effect is not knowing exactly what the produced drawing looks like, because it is occluded by the semi-transparent virtual template, which at the same time makes it more difficult to know which segment of the drawing still needs to be finished. A comment relating to this was made by one participant (see Sect. 5.2).

Another interesting observation relates to a strategy developed by participants in SP mode. They always drew all template segments shown within the screen, including those at the very edge of the screen, before they decided it was time to move the device to a new area. This behaviour was not observed in ML mode, where participants moved the device as the drawing progressed.

6 Discussion

The results show that the drawing quality achieved with the template method was not evidently better than that achieved with ML and SP; however, the template mode still scored best. This can be linked to the experiment design, which asked participants to complete the task as quickly and as accurately as possible. In template mode participants completed the drawing more than twice as fast; hence, the speed with which participants completed the task may have reduced the quality of the drawing. As the template mode was by far the fastest, with the highest QUIS score and the best quality ranking, we can conclude that the template mode performed best in our test.

6.1 How Effective Are Phones in Supporting User Sketching Through Virtual Tracing?

Even though virtual tracing with mobile phones took twice as much time as traditional tracing, participants were able to complete all tasks and achieve drawing quality comparable to that of the template mode. This allows us to conclude that it is possible to use mobile phones as in-situ sketching aids for virtual tracing. This is also supported by the fact that none of the participants found it difficult to look through the phone while virtually tracing, even though task completion took more than 10 min on average. These findings are also in line with our previous study [4].

6.2 Virtual Tracing Whilst Holding the ML in Hand

All participants managed to complete all virtual tracing tasks with comparable drawing quality. This includes tasks where the phone was placed on the stand (SP mode) and tasks where it was held in hand (ML mode). This outcome is not in line with the results from our 3D virtual tracing study [7], where placing a 3D object on a stand and holding the mobile phone in hand drastically decreased users' ability to trace. Holding an object in hand may have several advantages, among them the sense of proprioception, i.e. sensory input about where one hand (and its fingers) is positioned in relation to the other hand (and its fingers), which may lead to better depth perception [21]. Our results suggest that proprioception does not play such an important role in virtual tracing on a 2D surface; however, this may be different if one were to focus only on achieving the highest possible drawing accuracy.

6.3 Magic Lens vs. Static Peephole

The study results are promising for ML, even though they position ML on an equal footing with SP mode. ML performed similarly to SP even though the users had to hold the phone in hand while sketching, which widens the possible uses of such a system in real-world settings. However, the fact that participants showed a general dislike towards the marker, and the fact that they had to avoid occluding the marker in order for the system to operate, suggest that such tracking diminishes the sketching experience. Hence, building an ML system in which these limitations are not present is likely to significantly improve ML performance. This is in line with our previous study, in which such a system (dual-camera magic lens) was built and showed the potential to be both faster and to lead to a higher perceived satisfaction compared to SP and ML [4].

6.4 Understanding the Information Space

Only in SP mode participants expressed the need for an overview of the virtual image. Contrary to ML mode, in SP mode gaining an overview of what needs to be drawn is difficult because in SP mode participants had to stop drawing and use dragging and zooming gestures to explore the wider context. This action broke the alignment between the virtual template and the drawn contour, hence, every time this was done the user had to manually realign before virtual tracing could resume. In ML this was not a problem. Users moved the phone in order to explore a wider context during which the alignment of virtual and drawn segment was maintained.

6.5 Crossing the Boundary and Manual Alignment

Observations also show that participants spent most of the time looking at the paper through the phone's screen whilst always keeping the pencil within the camera's FOV. We hypothesize that there are two reasons for this: (i) the dual-view problem, which occurs when the observer's perspective does not match the perspective of the device camera [3]; and (ii) multiple disparity planes, because the drawing surface and the phone screen lie at different distances. Both make it difficult for users to simultaneously look at the phone and the surroundings.

Manual alignment was identified as the hardest part of the task; hence it is not a surprise that users tried to minimize the number of alignments. This was achieved by drawing to the very edge of the phone before moving to a new area. However, even though it was possible to capture a wider segment of the drawing surface by moving the phone to a greater distance, reducing the number of manual realignments and the need for an overview, participants adjusted the stand to a distance ranging between 17 and 22 cm. This was considered a comfortable viewing distance for the setup.
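The trade-off between viewing distance and the captured segment of the drawing surface can be made concrete with simple pinhole geometry. The sketch below assumes that the camera looks perpendicularly at the paper and has a horizontal field of view of 60 degrees; this value is an assumption chosen for illustration and was not measured on the phone used in the study.

```python
import math

# ASSUMPTIONS: ideal pinhole camera, optical axis perpendicular to the paper,
# and an illustrative horizontal FOV of 60 degrees (not a measured value).
ASSUMED_HFOV_DEG = 60.0
A3_LONG_EDGE_CM = 42.0   # A3 paper is 29.7 cm x 42.0 cm


def visible_width_cm(distance_cm, hfov_deg=ASSUMED_HFOV_DEG):
    """Width of the paper strip captured by the camera at a given distance."""
    return 2.0 * distance_cm * math.tan(math.radians(hfov_deg) / 2.0)


def distance_for_width_cm(width_cm, hfov_deg=ASSUMED_HFOV_DEG):
    """Distance at which the camera captures a strip of the given width."""
    return width_cm / (2.0 * math.tan(math.radians(hfov_deg) / 2.0))


for d in (17.0, 22.0):
    print(f"{d:.0f} cm -> {visible_width_cm(d):.1f} cm visible")
print(f"whole A3 width needs about {distance_for_width_cm(A3_LONG_EDGE_CM):.1f} cm")
```

Under these assumptions, the observed 17–22 cm range captures roughly half of the A3 long edge at a time, so several repositioning steps are needed to cover the whole sheet; a different field of view would change the numbers but not the linear relationship between distance and visible width.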

7 Conclusion and Future Work

Traditional sketching aids rely on the physical production of templates or stencils, which can be limiting and time consuming, particularly in the case of larger formats. The alternative is virtual tracing using a mobile phone. We have evaluated and compared three different interaction modes (a physical template, Static Peephole (SP), and Magic Lens (ML)) by running a user study with seven participants in which participants attempted to draw a cartoon character in each mode.

The results show that (i) the traditional template mode is the fastest mode, with the highest perceived user satisfaction and the best rank; (ii) it is possible to use mobile phones as in-situ sketching aids; (iii) contrary to 3D virtual tracing [7], 2D virtual tracing is possible whilst holding the phone in hand; and (iv) only in SP mode did participants express the need for a feature that would allow them to understand the wider context (e.g. a minimap). Finally, the results suggest that currently available tracking systems diminish the ML sketching experience; hence, future systems should aim to find camera pose tracking solutions that: (i) avoid requiring manual alignment; (ii) avoid marker use; and (iii) enable participants to occlude any desired segment of the camera's FOV without causing system failure.

In addition, future systems should look into ways of detecting what has been drawn thus far and updating the virtual template so that it only augments the paper with what remains to be drawn. Due to the small number of participants, statistical analysis was not possible; hence, in the future a greater number of participants should be recruited to complete the study. Moreover, a study exploring the use of the stand in the context of the ML interaction mode should be carried out. Finally, future studies should explore virtual tracing using a virtual mirror and look into ways of supporting sculpturing practices.