Perceptual Analysis of Talking Avatar Head Movements: A Quantitative Perspective

Xiaohan Ma, Binh Huy Le, and Zhigang Deng

SIGCHI International Conference on Human Factors in Computing Systems (CHI), 2011

The left panel shows our user study setup. The right panel shows the frequency-domain analysis results.


Abstract: Lifelike interface agents (e.g., talking avatars) have been increasingly used in human-computer interaction applications. In this work, we quantitatively analyze how human perception is affected by the audio-head motion characteristics of talking avatars. Specifically, we quantify the correlation between perceptual user ratings (obtained via a user study) and joint audio-head motion features, as well as head motion patterns in the frequency domain. Our quantitative analysis results clearly show that the correlation coefficient between the pitch of speech signals (but not the RMS energy of speech signals) and head motions is approximately linearly proportional to the perceptual user rating, and that a larger proportion of high-frequency signals in talking avatar head movements tends to degrade user perception in terms of naturalness.
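
As a rough illustration of the two kinds of measurements named in the abstract, the following Python sketch computes a Pearson correlation coefficient between a pitch contour and a head rotation signal, and the fraction of spectral energy above a cutoff frequency in a head motion signal. It is not the paper's implementation: the function names (`pearson`, `high_freq_ratio`), the 30 fps frame rate, the 5 Hz cutoff, and the synthetic signals are all assumptions chosen for illustration.

```python
import cmath
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def high_freq_ratio(signal, sample_rate, cutoff_hz):
    """Fraction of non-DC spectral energy at or above cutoff_hz (naive DFT)."""
    n = len(signal)
    total = high = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin (k = 0)
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(signal))
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total else 0.0


# Synthetic stand-ins (assumed, not data from the study):
# a pitch contour in Hz and a head rotation angle in degrees, one value per frame.
fps = 30
frames = range(60)
pitch = [200 + 20 * math.sin(2 * math.pi * 1.0 * t / fps) for t in frames]
smooth_head = [5 * math.sin(2 * math.pi * 1.0 * t / fps) for t in frames]
jittery_head = [h + 2 * math.sin(2 * math.pi * 10.0 * t / fps)
                for t, h in zip(frames, smooth_head)]

print("pitch/head correlation:", pearson(pitch, smooth_head))
print("high-frequency ratio (smooth):", high_freq_ratio(smooth_head, fps, 5.0))
print("high-frequency ratio (jittery):", high_freq_ratio(jittery_head, fps, 5.0))
```

In this toy setup the smooth head motion follows the pitch contour exactly and carries essentially no energy above the cutoff, while the jittery variant adds a 10 Hz component that raises its high-frequency ratio. How such quantities relate to perceived naturalness is what the user study in the paper measures; the sketch only shows how the quantities themselves could be computed.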


Download: [paper] [video]


BibTeX

@inproceedings{XiaohanMa:CHI:2011,
author = {Ma, Xiaohan and Le, Binh Huy and Deng, Zhigang},
title = {Perceptual analysis of talking avatar head movements: a quantitative perspective},
booktitle = {Proceedings of the SIGCHI Conference on Human Factors in Computing Systems},
series = {CHI '11},
year = {2011},
isbn = {978-1-4503-0228-9},
location = {Vancouver, BC, Canada},
pages = {2699--2702},
numpages = {4},
url = {http://doi.acm.org/10.1145/1978942.1979337},
doi = {10.1145/1978942.1979337},
acmid = {1979337},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {audio-head motion features, head motion, perceptual modeling, quantitative analysis},
}