Monday, September 1, 2008

Visual Similarity of Pen Gestures

A. Chris Long, Jr., James A. Landay, Lawrence A. Rowe, and Joseph Michiels

Summary
The increasing popularity of pen (stylus) based computer input makes it worth studying how hard it is for users to learn new gestures, which are meant to mimic the natural way we write on paper. The authors ran two studies to establish why users find gestures similar, then used that data to build a gesture design tool that helps designers create gesture sets that are easy to remember and perform.
Since the heart of this paper is the experiments, I will forgo the "Related Work" section, which basically covers pen interfaces and the machines that use them, perceptual similarity (how changes in geometric and perceptual properties influence perceived similarity), and multi-dimensional scaling (MDS, a way of reducing a data set so patterns can be seen in two or three dimensions).
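To make the MDS idea concrete, here is a rough sketch (my own illustration, not the authors' actual analysis) of classical MDS: given a dissimilarity matrix among gestures, double-centering and an eigendecomposition recover low-dimensional coordinates whose distances approximate the dissimilarities. The toy matrix below is made up.

```python
import numpy as np

def classical_mds(D, dims=2):
    """Embed items in `dims` dimensions from a dissimilarity matrix D
    (classical/Torgerson MDS via double-centering)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    evals, evecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:dims]     # keep the top `dims` components
    scale = np.sqrt(np.maximum(evals[idx], 0))
    return evecs[:, idx] * scale             # n x dims coordinates

# Toy dissimilarities among 4 hypothetical gestures: 0 and 1 are alike,
# 2 and 3 are alike, and the two groups are far apart.
D = np.array([[0, 1, 4, 3],
              [1, 0, 3, 4],
              [4, 3, 0, 1],
              [3, 4, 1, 0]], float)
coords = classical_mds(D, dims=2)
```

Plotting such coordinates is what lets one eyeball which geometric properties line up with an axis of perceived similarity.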
Trial 1
The purpose of this trial was twofold: to determine which measurable geometric properties influence gestures' perceived similarity (accomplished with MDS), AND to produce a model that could predict how similar two gestures would appear to a person (accomplished with regression analysis). A gesture set with a wide degree of separability was constructed. Subjects were shown "triads" (three randomly chosen gestures; 364 triads per user) and asked to pick the one that was most different. The 21 users' responses were used to construct a dissimilarity matrix.
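One plausible way to turn triad judgments into a dissimilarity matrix (my own sketch; the paper's exact aggregation scheme may differ) is to count, each time a gesture is picked as the odd one out, both pairs containing it as dissimilar, then normalize by how often each pair appeared in a triad:

```python
import numpy as np

# Toy triad judgments over 5 hypothetical gestures (ids 0-4).
# Each entry: (triad, odd) -- `odd` is the gesture a subject
# picked as "most different" from the other two.
judgments = [((0, 1, 2), 2), ((0, 1, 3), 3), ((1, 2, 4), 4),
             ((0, 2, 3), 0), ((1, 3, 4), 1), ((0, 1, 4), 4)]

n = 5
counts = np.zeros((n, n))   # times each pair was judged dissimilar
shown = np.zeros((n, n))    # times each pair appeared in a triad

for triad, odd in judgments:
    for a in triad:
        for b in triad:
            if a < b:
                shown[a, b] += 1
                shown[b, a] += 1
                if odd in (a, b):      # picking `odd` marks both pairs
                    counts[a, b] += 1  # containing it as dissimilar
                    counts[b, a] += 1

# Dissimilarity = fraction of appearances in which the pair was split
dissim = np.divide(counts, shown, out=np.zeros_like(counts), where=shown > 0)
```

A matrix like this (averaged over all 21 subjects and 364 triads per subject) is the input MDS needs.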
The model they built correlated with judged gesture similarity at 0.74.
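That 0.74 is a correlation between model-predicted and observed dissimilarities. A minimal illustration of the regression step, using made-up per-pair feature differences rather than the paper's actual geometric features:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up feature differences for 50 gesture pairs (e.g. differences
# in angle, aspect, and length -- stand-ins for the paper's features)
X = rng.normal(size=(50, 3))
# Synthetic "observed" dissimilarities: linear in the features plus noise
observed = X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.5, size=50)

# Least-squares regression of observed dissimilarity on the features
A = np.column_stack([X, np.ones(50)])  # add an intercept column
coef, *_ = np.linalg.lstsq(A, observed, rcond=None)
predicted = A @ coef

# Model quality reported as the Pearson correlation r
r = np.corrcoef(predicted, observed)[0, 1]
```

The noise level here controls how far r falls short of 1; human perceptual data is noisier still, which is part of why 0.74 is hard to beat.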
Trial 2
The purpose of this trial was to study three things:
1- total absolute angle and aspect
2- length and area
3- rotation related features
Because the feature sets became too large to show all possible combinations of the three sets built, a fourth feature set containing features from the first three was constructed. Subjects were then shown triads from all four sets, for a total of 538 triads. The model they built using MDS and regression gave a correlation of 0.71.
When the model from the first trial was applied to the data of the second trial, the correlation dropped to 0.56, and it was even worse when model two was applied to data set 1.

Discussion
The goal of the experiments and the way they were conducted was great. I am a big fan of getting people to use the systems we build in order to evaluate them; after all, no matter how clever we think our solutions are, we are often too close to them to see them objectively. That being said, I am not satisfied that the second part of their research goal was accomplished. For a specific environment, being able to correlate at 0.74 might be adequate, but as the goal was to build this tool for any gesture environment, only having a correlation of 0.54 means that the system is just "guessing" and not providing the insights a designer would want/need. Secondly, the authors mention that in their regression analysis they removed features that were "only obtainable by subjective judgement". I am not exactly sure what that means, but it makes their numbers of 0.74 and 0.71 seem a little weaker to me.
I think it would be interesting to repeat this experiment to determine whether subjects grouped by gender, race, or handedness (left vs. right) cluster into similar gesture-similarity judgments.

1 comment:

Unknown said...

Thanks a lot for blogging about my work. (I'm Chris Long, the primary
author.) It's great to see that people are still interested in my
gesture research. I agree with your discussion that the models are
not as predictive as we'd like them to be, but I think you might be
underestimating how good they were. You said "a correlation of 0.54
means that the system is just 'guessing'", but that's not
accurate. Correlation is measured from -1 to 1, so correlation of 0
would be purely random (and totally non-predictive). Correlation of
0.54, however, is significantly better than random (even if not as
good as we'd like), and much better than just a guess. This
experiment was part of my dissertation work, and according to a
psychology professor (and expert on perception) who was on my thesis
committee, this level of correlation is pretty respectable for a model
of human perception.

BTW, I incorporated the results of this research into a tool for
designing pen-based gestures, called Quill. I haven't worked on it for
ages, but
it's still available on SourceForge. It also included advice about
when the gesture recognizer was likely to get confused by gestures in
the same set, and I think this advice is definitely useful.

If you'd like more info, let me know and I can send you PDFs or links.