Phylter: A System for Modulating Notifications in Wearables Using Physiological Sensing
Abstract. As wearable computing becomes more mainstream, it holds the promise of delivering timely, relevant notifications to the user. However, these devices can also inundate the user, distracting them at the wrong times and providing the wrong amount of information. As physiological sensing likewise becomes consumer-grade, it holds the promise of helping to control these notifications. To address this, we built Phylter, a system that uses physiological sensing to modulate notifications to the user. Phylter receives streaming data about a user’s cognitive state and uses it to decide whether the user should receive a given notification. We discuss the components of the system and how they interact.
Keywords: fNIRS · Adaptive interfaces · Brain-computer interfaces · Google Glass
1 Introduction

Wearable technology such as Google Glass has the capability to capture and deliver information to its user with an immediacy that surpasses the current generation of input/output devices. Combined with Google Now, Search, and a rich store of personalized, situation-sensitive data, the capability to swiftly display information enables a new genre of consumer-grade human-computer interaction, where the computer ceases to be the focus and instead becomes an inconspicuous assistant to ordinary activities. But this immediacy and amalgamation with the user carry a steep price when the bond is broken with an untimely interruption.
Delivering information at the wrong moment, or delivering the wrong information for the user’s current situation, can disrupt work or social interactions and exacerbate the very problems that wearables such as Glass might solve. Muting all notifications is an option, but frequent manual muting and unmuting would itself be disruptive. Instead, imagine a “magic” knob driven by the moment-to-moment measured interruptibility of the user. With such a control, Glass could attain the sweet spot of being actively informative without being overly obtrusive. It would deliver notifications to users precisely when they have the time and capacity to perceive them. For example, it could prioritize incoming emails or social network messages to present the most important items first and defer others to a more opportune time; it could also summarize or suppress lower-level details, such as those on a map, when it detects that the user is busy. High-priority notifications that are important regardless of state, however, can bypass this filter and be shown to the user immediately. In a dual-task scenario, such a system can let the user focus on a primary task and interrupt them to work on another task only when it detects that the user can handle the additional effort.
This paper introduces a physiologically based notification filtering system, Phylter, that sends pertinent notifications to a user only when the user is in the proper cognitive state to handle additional information. The system uses physiological sensing as a means to time, suppress, and modulate information streams in real time. We posit that functional near-infrared spectroscopy (fNIRS), a lightweight brain-monitoring technology, has promise to control this framework because of its access to measures of blood flow in the brain, an overarching barometer of the user’s level of cognitive workload – the degree to which present engagements place computational demands on short-term working memory. Using machine learning algorithms trained to distinguish known instances of high and low cognitive workload exclusively from fNIRS data, our brain-augmented Glass prototype recognizes the neural signature of its user’s short-term memory workload and applies this knowledge to capitalize on the most opportune moments to deliver information.
2 Related Work
2.1 Short-Term Memory Workload and Interruptions
In the age of mobile computing and social media, interruptions from e-mail, instant messages, and other services that are granted unsupervised access to the browser, cell phone, or wearable computer’s output threaten to destabilize a user’s ability to focus on a single task. In one study by Bailey and Konstan, multitasking participants reported twice the anxiety, committed twice the number of errors, and required up to 25% longer time on a primary task when interruptions arrived during rather than in between tasks. To mitigate the costs of interruption, research has explored a breakpoint-based method for mediating notifications in which statistical models infer likely points of transition between tasks and schedule notifications for these moments. But this method requires complete knowledge of all of the user’s concurrent tasks, and assumes that the sum of the user’s cognition can be understood within the digital environment. The second assumption, in particular, loses validity in a wearable computing context – where the interface is no longer the main object of the user’s attention. Horvitz and Apacible devised a mathematical model of the cost and utility of interruptions, but their model assumes knowledge of the attentional state of the user and the utility of interruption.
In cognitive science, working memory refers to the mental resources dedicated to storage, retrieval, and manipulation of information on a short timescale – measured in seconds, not minutes. It is involved in higher cognitive processes such as language, planning, learning, and reasoning. Some of the most popular models of working memory [29, 39] posit that the system operates under severe constraints, with competition for a limited pool of resources among the numerous tasks that might at any moment engage it. A task pushing the upper bound of working memory’s phonological loop (the component engaged by subvocal mental rehearsal) may not directly undermine the processing done by the visuospatial sketchpad (another component, for visual simulation and recall), but, drawing from a common pool of computational resources, two simultaneous working memory tasks nonetheless limit overall performance. Cognitive workload depends on the characteristics of the task, of the operator, and of the environment. Working memory and executive function engage areas in the prefrontal cortex, and the amount of activation increases as a function of the number of items held in working memory.
Research has explored the interruptibility of a user through physiological input such as heart rate variability, EEG, and pupil dilation [4, 16]. In these studies, the physiological sensor is calibrated to detect cognitive workload, as it has long been acknowledged that moments of low cognitive workload present the most opportune time for interruptions, in part because workload diminishes at task boundaries. Tremoulet et al. found that by queuing questions and alerts until the user was in a state of low workload (measured by EEG, heart rate, and galvanic skin response), they could increase the number of tasks that the user could complete, reduce error rate, and decrease decision-making time for the interrupting alert tasks.
2.2 Passive Brain-Computer Interfaces
Passive physiological interfaces portray the user’s present state of mind without requiring continuous, deliberate effort from the user. As such, they can supplement direct input with implicit input (derived from a physiological sensor attached to the user) and apply the gleaned information to trigger adaptations that aid the user’s short-term or long-term goals. When the underlying physiological interface measures brain activity, these systems, known as passive or implicit brain-computer interfaces (BCIs), benefit the user by deducing state without additional effort on their part. In contrast to the much wider usage of brain sensors in active BCIs (where the user consciously manipulates mental activity in order to trigger an intended command), passive BCIs offer a practical defense against inevitable physiological misclassifications and against the small but not negligible lag between the physical manifestation of a thought and the deliverable command.
In controlled experiments, real-time passive BCIs have yielded measurable improvements in users’ performance compared to static counterparts. Prinzel et al. used EEG signals to modulate levels of automation in simultaneous auditory and hand-eye coordination tasks, and Wilson and Russell used an EEG engagement index to decelerate UAVs or present alerts depending on what would most effectively sustain the user’s focus. Stripling et al. built a system in which an operator could create rules for the user’s physiological state that triggered pre-recorded macros to manipulate a virtual environment when physiological conditions were met. Recently, real-time adaptive systems have used passive fNIRS input to modify robot autonomy, control a movie recommendation engine, and modify the number of UAVs assigned to an operator.
As an alternative to EEG, functional near-infrared spectroscopy (fNIRS) measures blood-oxygenation levels in neural tissue as deep as 3 cm. The technique relies on the fact that infrared light penetrates bone and other tissues but is absorbed and scattered by oxygenated and deoxygenated hemoglobin. Conveniently, the optical properties of oxygenated and deoxygenated hemoglobin differ, and so the relative proportion of the two can be deduced from the infrared light returned to the detector. fNIRS measures the same blood-oxygenation-level-dependent signal as fMRI, but only in the part of the brain where the sensor is applied. fNIRS has high spatial resolution, but because the associated changes in blood flow lag neural activity by several seconds, fNIRS is not suitable for direct input. Instead, fNIRS can portray more stable trends in the user’s mental state, and can be used to distinguish workload levels [9, 12] or multitasking [2, 33].
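To make the measurement principle concrete, the relative concentration changes of the two hemoglobin species can be recovered by inverting a small linear system (the modified Beer–Lambert law). The sketch below is a minimal Python illustration; the extinction coefficients, wavelengths, and path-length values are illustrative placeholders, not calibrated constants.

```python
# Sketch of the modified Beer-Lambert law used to recover hemoglobin
# concentration changes from fNIRS light attenuation at two wavelengths.
# All numeric constants below are illustrative placeholders.

def hemoglobin_deltas(d_od_690, d_od_830,
                      eps=((2.10, 0.39),   # (eps_HbR, eps_HbO2) at ~690 nm
                           (0.69, 1.07)),  # (eps_HbR, eps_HbO2) at ~830 nm
                      path_length=3.0, dpf=6.0):
    """Solve the 2x2 linear system dOD = eps * dC * L * DPF for
    (delta HbR, delta HbO2). Units are arbitrary in this sketch."""
    L = path_length * dpf          # effective optical path length
    (a, b), (c, d) = eps
    det = (a * d - b * c) * L      # determinant of the scaled system
    d_hbr = (d * d_od_690 - b * d_od_830) / det
    d_hbo2 = (-c * d_od_690 + a * d_od_830) / det
    return d_hbr, d_hbo2
```

In practice the two wavelengths are chosen on opposite sides of the hemoglobin isosbestic point so that the system is well conditioned.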
In many cases, fNIRS continues to provide moderately accurate descriptions of its wearer’s brain even when the user is in motion. Head movements, heartbeats, and respiration can be corrected with filters, and standard computer interactions like typing and clicking do not interfere with the signal [10, 19, 32]. fNIRS sensors consist primarily of multiple infrared light sources (at two wavelengths, to detect oxygenated and deoxygenated hemoglobin) and detectors, usually attached to a processing unit by fiber-optic cables. With recent advancements in signal processing, microelectronics, and wireless communications, fNIRS has become portable, supporting light sources and detectors in a self-contained unit. These wireless devices can accurately measure activity while users are performing real-world tasks such as running or bicycle riding, with only slightly higher error rates than a traditional clinical device. Feature selection can be used to improve the efficiency and accuracy of the machine learning algorithms that translate fNIRS signals to classifications in real time.
3 System Design
3.1 Physiological Input
Because raw physiological signal values differ between individuals, a back-end engine creates a machine learning classifier for each individual using our system, and then feeds real-time data into this model. We used the online fNIRS analysis and classification (OFAC) tool, which has been shown to produce real-time classifications of fNIRS data with high accuracy. Participants first complete trials of a task that stimulates known cognitive states – reference points that can later be used to determine the user’s state when the ground truth is otherwise impossible to gauge. For example, participants might complete trials of the n-back task, generating multiple labeled time series which, after being described in terms of appropriate statistical features, serve as instances for the open-source machine learning library LIBSVM. Trained on both high-workload and low-workload instances, LIBSVM ultimately allows rapid binary classification on a moving window of time segments in real time. This approach has been used to let an interactive human-robot system change its level of autonomy based on whether it detected a particular state of multitasking, to measure preference signals controlling a movie recommendation engine, and to expand the motor space of high-priority targets in a visual search task.
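To make the calibration step concrete, the sketch below shows how labeled time segments might be reduced to statistical features and used to train a per-user binary workload classifier. Phylter uses LIBSVM for the actual model; the nearest-centroid classifier here is a dependency-free stand-in, and the specific features (mean and linear slope) are illustrative assumptions.

```python
# Sketch of per-user calibration: labeled fNIRS time segments from an
# n-back session are reduced to statistical features, then used to fit a
# binary workload model. A nearest-centroid model stands in for LIBSVM.

def extract_features(window):
    """Describe one time segment by simple statistics (mean, linear slope)."""
    n = len(window)
    mean = sum(window) / n
    x_mean = (n - 1) / 2
    # least-squares slope of the signal against sample index
    num = sum((x - x_mean) * (y - mean) for x, y in zip(range(n), window))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return (mean, num / den)

class CentroidWorkloadModel:
    """Stand-in for the LIBSVM model: classify by nearest class centroid."""
    def fit(self, instances, labels):
        self.centroids = {}
        for lbl in set(labels):
            rows = [f for f, l in zip(instances, labels) if l == lbl]
            self.centroids[lbl] = tuple(sum(c) / len(rows) for c in zip(*rows))
        return self
    def predict(self, features):
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
        return min(self.centroids, key=lambda l: dist(self.centroids[l], features))
```

A real deployment would use many features per optical channel and LIBSVM's probability estimates, which are what the confidence values in the stream below would carry.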
The client receives a continuous stream of machine learning classifications in string form. Each classification is a colon-delimited string containing the first letter of the most probable prediction as well as the associated confidence value of it and the other (potentially numerous) possibilities. Based on Afergan et al.’s method of triggering adaptations from a moving window of the most recent confidence values, we store a running confidence for each classification over a user-defined period of time (typically 5–20 s). Less sensitive to erratic swings in classification, the sliding window provides a more conservative estimate of the user’s state, as a small number of misclassifications will not necessarily provoke incorrect adaptations, an important design principle for mitigating negative effects of BCIs.
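A minimal sketch of this client-side stream handling, assuming a hypothetical wire format such as "h:0.85:0.15" (first letter of the winning label, then the confidence values):

```python
# Sketch of the sliding-window confidence smoothing described above.
# The exact wire format is an assumption for illustration.
from collections import deque

class SlidingConfidence:
    def __init__(self, window_size=10):
        # window_size ~ classifications spanning the 5-20 s period in the text
        self.window = deque(maxlen=window_size)

    def ingest(self, packet):
        """Parse e.g. 'h:0.85:0.15' and store the confidence that the
        user is in the high-workload state."""
        parts = packet.strip().split(":")
        label, top = parts[0], float(parts[1])
        # normalize: always store the probability of 'high workload'
        self.window.append(top if label == "h" else 1.0 - top)

    def high_workload(self, threshold=0.5):
        """Conservative estimate: average over the window, not the last packet."""
        if not self.window:
            return False
        return sum(self.window) / len(self.window) > threshold
```

Because the deque discards the oldest value automatically, a short burst of misclassifications shifts the average only slightly, matching the design principle above.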
3.2 Notification Input and Output
Phylter can process notifications from an email server, a messaging service, or a custom application, as long as the source adheres to a basic string or XML format and includes a marker at the beginning of each packet specifying the notification level. Phylter handles three levels of notifications: never send (useful only for archival or experimental purposes), always send for high-priority notifications, and adaptively send for physiologically based filtering. It displays and logs the notifications it receives so that whatever system utilizes the service knows whether or not the user has received a message.
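The three-level routing can be sketched as follows; the one-character level markers and the pipe-delimited packet body are assumptions for illustration, not the paper's actual protocol:

```python
# Sketch of the three notification levels described above.
# Markers: 'n' = never send, 'a' = always send, 'p' = adaptively send.

LEVELS = {"n": "never", "a": "always", "p": "adaptive"}

def route_notification(packet, user_busy):
    """Return (deliver, level) for a packet like 'a|Meeting in 5 minutes'."""
    marker, _, body = packet.partition("|")
    level = LEVELS.get(marker, "adaptive")
    if level == "never":
        return (False, level)        # logged for archival/experimental use only
    if level == "always":
        return (True, level)         # high priority bypasses the filter
    return (not user_busy, level)    # adaptive: defer while workload is high
```

The `user_busy` flag is where the smoothed physiological estimate plugs in.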
As a prototype of the wearable device that ultimately receives the message, we built a custom message handler for Google Glass that receives notifications and displays them for a set period of time before clearing the screen. If running as a background service, the application can turn on the screen to display the message, and then deactivate the screen once the notification ceases to be relevant. The message handler is built on a simple shell script that can be customized for the protocol of other wearable devices.
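A hedged sketch of such a dispatch step, using adb's standard `am broadcast` mechanism; the intent action name is a placeholder, and the device-side handler protocol is whatever the Glass application actually expects:

```python
# Sketch of pushing a notification to the Glass message handler over the
# Android Debug Bridge. The intent action name is a hypothetical placeholder.
import subprocess

def adb_broadcast_command(message, action="com.example.phylter.NOTIFY"):
    """Build the adb invocation; kept separate from execution for testing."""
    return ["adb", "shell", "am", "broadcast",
            "-a", action, "--es", "text", message]

def send_to_glass(message):
    # assumes adb is on PATH and a Glass device is connected and authorized
    subprocess.run(adb_broadcast_command(message), check=True)
```

Separating command construction from execution keeps the shell-script layer swappable for other wearable devices, as the text suggests.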
3.3 Server Architecture
The core functionality of Phylter relies on a client-server architecture. Phylter runs two concurrent threads to receive information over TCP/IP, opening separate ports for physiological input and notifications. It acts as the server in these connections so that it can handle multiple sources for each type of input. Every time Phylter receives a new physiological classification, it updates its running average of the physiological state by discarding the oldest classification and adding the new data point. When it receives a notification marked as adaptive, it checks the user’s physiological state and, if appropriate, sends the notification to the wearable device via an Android Debug Bridge (adb) communication channel triggered by a shell script.
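A minimal sketch of this two-port server loop, using Python's standard socketserver module; the port assignment, packet formats, and the single busy-state rule are illustrative assumptions:

```python
# Sketch of Phylter's server architecture: one TCP listener for the
# physiological stream, one for notifications, each served on its own thread.
import socketserver
import threading

state = {"busy": False}  # stand-in for the running physiological estimate

class PhysioHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:  # one classification packet per line
            label = line.decode().strip().split(":")[0]
            state["busy"] = (label == "h")

class NotifyHandler(socketserver.StreamRequestHandler):
    def handle(self):
        packet = self.rfile.readline().decode().strip()
        marker, _, body = packet.partition("|")
        if marker == "a" or (marker == "p" and not state["busy"]):
            self.wfile.write(b"SENT\n")  # real system would dispatch via adb
        else:
            self.wfile.write(b"HELD\n")

def serve(port, handler):
    """Start a threaded TCP server on its own thread and return it."""
    srv = socketserver.ThreadingTCPServer(("localhost", port), handler)
    srv.daemon_threads = True
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv
```

Acting as the server on both ports, as the text notes, lets multiple physiological sensors and multiple notification sources connect to the same instance.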
3.4 Data Logs
Phylter records a detailed, timestamped log of its activity in plain text. It saves (in separate files) a record of all of the physiological input, as well as a list of notifications and what messages were ultimately sent to the user. This allows an operator to see the efficacy of a system and what information a user did and did not receive.
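A sketch of this style of logging, assuming tab-separated plain-text files with ISO-8601 timestamps (the file names and line format are illustrative):

```python
# Sketch of Phylter's timestamped plain-text logging: separate files for
# the physiological stream and for notification decisions.
import datetime

def log_line(path, message):
    """Append one timestamped record to a plain-text log file."""
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(path, "a") as f:
        f.write(f"{stamp}\t{message}\n")

# usage: one file per stream, as described above
# log_line("physio.log", "h:0.85:0.15")
# log_line("notifications.log", "SENT\tp|New email")
```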
4 Conclusion

As computing devices continue to battle for users’ attention, Phylter limits less significant notifications that would distract users from focusing on a single task. It serves as a framework to prevent information overload by modulating the display of notifications to the user. While our initial setup is designed for fNIRS brain data and Google Glass, it uses generic network protocols and a framework that can be extended to other input or output devices.
Phylter is composed of several self-contained systems which communicate with each other wirelessly. As these components and their requirements reduce in size and computational and power requirements, we envision that this system could become completely portable and run with commercial electronics in the near future. With future improvements to the system, Phylter could control not only the timing of notifications, but the delivery mechanism, distributing notifications across multiple wearable devices or even between devices  to balance user-awareness with the cost of interruption.
This software represents a step toward physiologically based notifications, and suggests that gating notifications on and off can make a discernible difference. To assess the validity of the system, we plan to run a controlled laboratory experiment to test whether user performance does indeed improve when the user receives only pertinent notifications at opportune times.
Acknowledgments. We thank Shiwan Zuo, Beste Yuksel, Alvitta Ottley, Eli Brown, Fumeng Yang, Lane Harrison, Sergio Fantini, and Angelo Sassaroli from Tufts University, Erin Solovey from Drexel University, and Michael Rennaker, Timothy Jordan, and Alex Olwal from Google. We also thank Google Inc. and the NSF for support of this research (NSF Grants Nos. IIS-1065154 and IIS-1218170). Any opinions, findings, conclusions, or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of Google Inc. or the National Science Foundation.
References

- 1. Afergan, D., Peck, E.M., Solovey, E.T., Jenkins, A., Hincks, S.W., Brown, E.T., Chang, R., Jacob, R.J.K.: Dynamic difficulty using brain metrics of workload. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 3797–3806. ACM (2014)
- 2. Afergan, D., Shibata, T., Hincks, S.W., Peck, E.M., Yuksel, B.F., Chang, R., Jacob, R.J.K.: Brain-based target expansion. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 583–593. ACM (2014)
- 6. Chen, D., Vertegaal, R.: Using mental load for managing interruptions in physiologically attentive user interfaces. In: CHI 2004 Extended Abstracts on Human Factors in Computing Systems, pp. 1513–1516. ACM (2004)
- 7. Cutrell, E.B., Czerwinski, M., Horvitz, E.: Effects of instant messaging interruptions on computing tasks. In: CHI 2000 Extended Abstracts on Human Factors in Computing Systems, pp. 99–100. ACM (2000)
- 8. Cutrell, E., Tan, D.: BCI for passive input in HCI. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM (2008)
- 9. Girouard, A., Solovey, E.T., Hirshfield, L.M., Chauncey, K., Sassaroli, A., Fantini, S., Jacob, R.J.K.: Distinguishing difficulty levels with non-invasive brain activity measurements. In: Gross, T., Gulliksen, J., Kotzé, P., Oestreicher, L., Palanque, P., Prates, R.O., Winckler, M. (eds.) INTERACT 2009. LNCS, vol. 5726, pp. 440–452. Springer, Heidelberg (2009)
- 12. Hirshfield, L.M., Solovey, E.T., Girouard, A., Kebinger, J., Jacob, R.J.K., Sassaroli, A., Fantini, S.: Brain measurement for usability testing and adaptive interfaces: an example of uncovering syntactic workload with functional near infrared spectroscopy. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2185–2194. ACM (2009)
- 13. Horvitz, E., Apacible, J.: Learning and reasoning about interruption. In: Proceedings of the International Conference on Multimodal Interfaces, pp. 20–27. ACM (2003)
- 15. Iqbal, S.T., Bailey, B.P.: Effects of intelligent notification management on users and their tasks. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 93–102. ACM (2008)
- 16. Iqbal, S.T., Zheng, X.S., Bailey, B.P.: Task-evoked pupillary response to mental workload in human-computer interaction. In: CHI 2004 Extended Abstracts on Human Factors in Computing Systems, pp. 1477–1480. ACM (2004)
- 18. Jones, B., Hesford, C.M., Cooper, C.E.: The use of portable NIRS to measure muscle oxygenation and haemodynamics during a repeated sprint running test. In: Huffel, S.V., Naulaers, G., Caicedo, A., Bruley, D.F., Harrison, D.K. (eds.) Oxygen Transport to Tissue XXXV, pp. 185–191. Springer, New York (2013)
- 19. Maior, H.A., Pike, M., Sharples, S., Wilson, M.L.: Examining the reliability of using fNIRS in realistic HCI settings for spatial and verbal tasks. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (in press). ACM (2015)
- 20. Mandic, D.P., Obradovic, D., Kuh, A., Adali, T., Trutschell, U., Golz, M., De Wilde, P., Barria, J.A., Constantinides, A.G., Chambers, J.A.: Data fusion for modern engineering applications: an overview. In: Duch, W., Kacprzyk, J., Oja, E., Zadrożny, S. (eds.) ICANN 2005. LNCS, vol. 3697, pp. 715–721. Springer, Heidelberg (2005)
- 21. Miyata, Y., Norman, D.A.: Psychological issues in support of multiple activities. In: Norman, D.A., Draper, S.W. (eds.) User Centered System Design: New Perspectives on Human-Computer Interaction, pp. 265–284. Lawrence Erlbaum Associates, Hillsdale (1986)
- 23. Parasuraman, R., Caggiano, D.: Neural and genetic assays of human mental workload. In: McBride, D.K., Schmorrow, D. (eds.) Quantifying Human Information Processing, pp. 123–149. Rowman & Littlefield Publishers Inc., Lanham (2005)
- 24. Peck, E.M., Afergan, D., Jacob, R.J.K.: Investigation of fNIRS brain sensing as input to information filtering systems. In: Proceedings of the Augmented Human International Conference, pp. 142–149. ACM (2013)
- 26. Pierce, J.S., Nichols, J.: An infrastructure for extending applications’ user experiences across multiple personal devices. In: ACM Symposium on User Interface Software and Technology, pp. 101–110. ACM (2008)
- 30. Shibata, T., Peck, E.M., Afergan, D., Hincks, S.W., Yuksel, B.F., Jacob, R.J.K.: Building implicit interfaces for wearable computers with physiological inputs: zero shutter camera and phylter. In: Proceedings of the Adjunct Publication of the ACM Symposium on User Interface Software and Technology, pp. 89–90. ACM (2014)
- 31. Solovey, E.T., Afergan, D., Peck, E.M., Hincks, S.W., Jacob, R.J.K.: Designing implicit interfaces for physiological computing: guidelines and lessons learned using fNIRS. ACM Trans. Comput.-Hum. Interact. (TOCHI) 21(6), 35 (2015)
- 32. Solovey, E.T., Girouard, A., Chauncey, K., Hirshfield, L.M., Sassaroli, A., Zheng, F., Fantini, S., Jacob, R.J.K.: Using fNIRS brain sensing in realistic HCI settings: experiments and guidelines. In: Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 157–166. ACM (2009)
- 33. Solovey, E., Schermerhorn, P., Scheutz, M., Sassaroli, A., Fantini, S., Jacob, R.J.K.: Brainput: enhancing interactive systems with streaming fNIRS brain input. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2193–2202. ACM (2012)
- 37. Tremoulet, P., Barton, J., Craven, P., Gifford, A., Morizio, N., Belov, N., Stibler, K., Regli, S.H., Thomas, M.: Augmented cognition for tactical Tomahawk weapons control system operators. In: Schmorrow, D., Stanney, K., Reeves, L. (eds.) Foundations of Augmented Cognition, pp. 313–318. Strategic Analysis Inc., Arlington (2006)
- 41. Zander, T.O., Kothe, C., Welke, S., Roetting, M.: Utilizing secondary input from passive brain-computer interfaces for enhancing human-machine interaction. In: Schmorrow, D.D., Estabrooke, I.V., Grootjen, M. (eds.) FAC 2009. LNCS, vol. 5638, pp. 759–771. Springer, Heidelberg (2009)