Modern commercial wearable devices are widely equipped with inertial measurement units (IMUs) and microphones. The motion and audio signals captured by these sensors can be used to recognize a variety of physical activities. Compared to motion data, audio data carries rich contextual information about human activities, but continuous audio sensing also imposes extra sampling burdens and raises privacy concerns. Given these challenges, this paper studies a novel approach that augments IMU models for human activity recognition (HAR) with the superior acoustic knowledge of activities. Specifically, we propose a teacher-student framework to derive an IMU-based HAR model…
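Below is a minimal sketch of the teacher-student distillation idea this abstract describes, written in PyTorch. A frozen audio "teacher" produces soft labels that the IMU "student" is trained to match alongside the ground-truth activity labels; the network shapes, temperature, and loss weighting here are illustrative assumptions, not the paper's actual architecture.

```python
# Hinton-style knowledge distillation from an audio teacher to an IMU student.
# All dimensions and hyperparameters are illustrative stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10    # hypothetical number of activity classes
TEMPERATURE = 4.0   # softens teacher logits for distillation
ALPHA = 0.7         # weight on the distillation term

# Stand-in networks; the real models would be far richer.
teacher = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))
student = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def distillation_step(audio_feat, imu_feat, labels):
    """One training step: the student mimics the teacher's soft labels
    while still fitting the ground-truth activity labels."""
    with torch.no_grad():                      # the teacher stays frozen
        teacher_logits = teacher(audio_feat)
    student_logits = student(imu_feat)

    # KL divergence between temperature-softened class distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / TEMPERATURE, dim=1),
        F.softmax(teacher_logits / TEMPERATURE, dim=1),
        reduction="batchmean",
    ) * (TEMPERATURE ** 2)
    hard_loss = F.cross_entropy(student_logits, labels)

    loss = ALPHA * soft_loss + (1 - ALPHA) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch: 128-dim audio embeddings paired with 6-axis IMU features.
audio = torch.randn(32, 128)
imu = torch.randn(32, 6)
y = torch.randint(0, NUM_CLASSES, (32,))
print(distillation_step(audio, imu, y))
```

At deployment time only the student is kept, so the IMU model inherits acoustic knowledge without the device ever recording audio in the field.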
Automatically recognizing a broad spectrum of human activities is key to realizing many compelling applications in health, personal assistance, human-computer interaction, and smart environments. In real-world settings, however, approaches to human action perception have largely been constrained to detecting mobility states, e.g., walking, running, standing. In this work, we explore the use of the inertial-acoustic sensing provided by off-the-shelf commodity smartwatches for detecting activities of daily living (ADLs). We conduct a semi-naturalistic study with a diverse set of 15 participants in their own homes and show that acoustic and inertial sensor data can be combined to recognize 23 activities…
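A minimal late-fusion sketch in the spirit of this abstract: two classifiers are trained independently on acoustic and inertial features, and their class probabilities are averaged. The feature dimensions, classifier choice, and fusion rule are illustrative assumptions, not the study's reported pipeline.

```python
# Late fusion of acoustic and inertial modalities for ADL recognition.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
N_CLASSES = 23  # the abstract reports recognizing 23 activities

# Toy windows: e.g., log-mel statistics for audio, accel/gyro stats for IMU.
X_audio = rng.normal(size=(500, 40))
X_imu = rng.normal(size=(500, 12))
y = rng.integers(0, N_CLASSES, size=500)

audio_clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_audio, y)
imu_clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_imu, y)

# Late fusion: average per-class probabilities from both modalities,
# then pick the most likely activity.
proba = 0.5 * audio_clf.predict_proba(X_audio) + 0.5 * imu_clf.predict_proba(X_imu)
pred = proba.argmax(axis=1)
print("fused training accuracy:", (pred == y).mean())
```

Averaging probabilities (rather than concatenating raw features) lets each modality be trained, tuned, or dropped independently, which is convenient when one sensor stream is unavailable.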
Acoustic sensing has proved effective as a foundation for numerous applications in health and human behavior analysis. In this work, we focus on the problem of detecting in-person social interactions in naturalistic settings from audio captured by a smartwatch. As a first step toward detecting social interactions, it is critical to distinguish the speech of the individual wearing the watch from all other nearby sounds, such as speech from other individuals and ambient noise. This is very challenging in realistic settings, where interactions take place spontaneously and supervised models cannot be trained a priori to capture the full complexity of dynamic social environments. In this paper, we introduce a transfer-learning-based approach to detect the foreground speech of users wearing a smartwatch…
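A minimal sketch of the transfer-learning idea in this abstract: a frozen backbone pretrained on generic audio stands in for the source model, and only a small binary head is trained to separate the wearer's foreground speech from all other sounds. The random stand-in backbone and all dimensions are assumptions for illustration; in practice the backbone would be a pretrained audio embedder such as VGGish or YAMNet.

```python
# Transfer learning for foreground-speech detection: freeze a pretrained
# audio backbone and train only a lightweight binary head on top of it.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(64, 128), nn.ReLU())  # stand-in embedder
for p in backbone.parameters():
    p.requires_grad = False                               # transfer: freeze

head = nn.Linear(128, 1)                                  # foreground vs. rest
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

def train_step(features, is_foreground):
    """Fit the head to label windows as wearer speech (1) or other sound (0)."""
    with torch.no_grad():
        emb = backbone(features)           # reuse the pretrained representation
    logits = head(emb).squeeze(1)
    loss = criterion(logits, is_foreground)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

x = torch.randn(16, 64)                    # toy audio feature windows
y = torch.randint(0, 2, (16,)).float()     # 1 = wearer's own speech
print(train_step(x, y))
```

Because only the head is trained, the approach needs far less labeled in-the-wild data than training a speech detector from scratch, which matches the abstract's point that full supervision is infeasible in spontaneous settings.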
Conversational assistants in the form of stand-alone devices such as Amazon Echo and Google Home have become popular and have been embraced by millions of people. By serving as a natural interface to services ranging from home automation to media players, conversational assistants help people perform many tasks with ease, such as setting timers, playing music, and managing to-do lists. While these systems offer useful capabilities, they are largely passive and unaware of the human behavioral context in which they are used. In this work, we explore how off-the-shelf conversational assistants can be enhanced with acoustic-based human activity recognition by leveraging the short interval after a voice command is given to the device…
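A minimal sketch of the post-command sensing idea in this abstract: the short audio window immediately following a voice command is sliced out and passed to a context classifier. The window length and the toy energy-based classifier are illustrative assumptions, not the paper's method.

```python
# Classify the brief audio interval that follows a voice command.
import numpy as np

SAMPLE_RATE = 16000
WINDOW_SECONDS = 2.0   # assumed brief post-command interval used for context

def post_command_window(stream, command_end_idx):
    """Slice the audio immediately after the command ends."""
    n = int(SAMPLE_RATE * WINDOW_SECONDS)
    return stream[command_end_idx:command_end_idx + n]

def classify_context(window):
    """Toy stand-in classifier: an energy threshold guesses whether the
    surrounding environment is acoustically 'active' or 'quiet'."""
    rms = np.sqrt(np.mean(window ** 2))
    return "active" if rms > 0.1 else "quiet"

# 10 s of fake audio; assume the command ends 3 s in (sample 48000).
stream = np.random.uniform(-1, 1, SAMPLE_RATE * 10)
print(classify_context(post_command_window(stream, command_end_idx=48000)))
```

The design point is that the device is already listening during this interval, so no additional always-on sensing is required to infer the user's activity context.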