Monday, September 1, 2008

Specifying Gestures by Example

Dean Rubine
Computer Graphics, Vol. 25, No. 4, July 1991

Summary
Rubine built a gesture-recognition architecture called GRANDMA (Gesture Recognizers Automated in a Novel Direct Manipulation Architecture). With this architecture he developed GDP, a gesture-based drawing program that recognizes a small set of single-stroke gestures; GDP's gestures are modeled with GRANDMA.
The heart of this paper is the way the author describes the feature set, the classification, the training, and the rejection, so that is what I will focus on.
Feature Set
The classifier uses 13 features, each incrementally computable in constant time per input point regardless of the gesture's size. The features are all simple mathematical functions of the stroke involving slopes, sums, lengths, and angles. Refer to the paper for the formulae.
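As a rough illustration, here is a sketch of a few of the 13 features (endpoint distance, path length, and total angle traversed) computed from a list of stroke points. This is my own simplified reading, not code from the paper, and it omits the timestamp-based features (speed, duration):

```python
import math

def features(points):
    """Compute a subset of Rubine-style stroke features.

    `points` is a list of (x, y) samples; the paper also uses
    timestamps for the speed and duration features, omitted here.
    """
    (x0, y0), (xn, yn) = points[0], points[-1]
    # Distance between the first and last point
    endpoint_dist = math.hypot(xn - x0, yn - y0)
    # Total stroke (path) length
    path_len = sum(math.hypot(points[i + 1][0] - points[i][0],
                              points[i + 1][1] - points[i][1])
                   for i in range(len(points) - 1))
    # Total angle traversed (signed turning summed along the stroke)
    total_angle = 0.0
    for i in range(1, len(points) - 1):
        dx1 = points[i][0] - points[i - 1][0]
        dy1 = points[i][1] - points[i - 1][1]
        dx2 = points[i + 1][0] - points[i][0]
        dy2 = points[i + 1][1] - points[i][1]
        total_angle += math.atan2(dx1 * dy2 - dx2 * dy1,
                                  dx1 * dx2 + dy1 * dy2)
    return [endpoint_dist, path_len, total_angle]

# A straight horizontal stroke: endpoint distance equals path length,
# and the total angle traversed is zero.
print(features([(0, 0), (1, 0), (2, 0), (3, 0)]))  # [3.0, 3.0, 0.0]
```

Each of these can be maintained incrementally as points arrive, which is what makes the constant-time-per-point claim work.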
Gesture Classification
Each gesture class c has a set of weights, and a gesture's feature vector is scored by the linear evaluation function v_c = w_c0 + Σ w_ci f_i. The gesture is assigned to the class whose evaluation is largest.
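The classification step is just an argmax over linear scores. A minimal sketch (the weight values here are made up for illustration, not taken from the paper):

```python
def classify(feature_vec, weights):
    """Pick the class whose linear evaluation v_c = w_0 + sum(w_i * f_i)
    is largest. `weights` maps class name -> (w_0, [w_1, ..., w_F])."""
    def score(w0, w):
        return w0 + sum(wi * fi for wi, fi in zip(w, feature_vec))
    return max(weights, key=lambda c: score(*weights[c]))

# Hypothetical two-class, two-feature recognizer
weights = {
    "line":   (0.0, [1.0, -0.5]),
    "circle": (1.0, [-1.0, 2.0]),
}
print(classify([2.0, 0.1], weights))  # line (1.95 beats circle's -0.8)
```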
Training
Training uses a linear discriminator to find the weights. From a set of training examples, the mean feature vector for each gesture class is computed by averaging the feature vectors of that class's examples. The weights are then derived from the per-class means and the inverse of the averaged covariance matrix.
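To make the training step concrete, here is a toy two-feature sketch of that recipe: per-class means, a pooled covariance estimate, its inverse, and the resulting weights. It follows my reading of the paper's formulas (w_cj = Σ_i inv_cov[i][j] · mean_c[i], w_c0 = -½ Σ_j w_cj · mean_c[j]); the example data is invented:

```python
def train(examples):
    """Sketch of linear-discriminant training for 2-D feature vectors.

    `examples` maps class name -> list of (f1, f2) feature vectors.
    Returns class name -> (w_0, [w_1, w_2]).
    """
    # Per-class mean feature vectors
    means = {c: [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]
             for c, vecs in examples.items()}
    # Pooled covariance estimate over all classes
    cov = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for c, vecs in examples.items():
        for v in vecs:
            d = [v[0] - means[c][0], v[1] - means[c][1]]
            for i in range(2):
                for j in range(2):
                    cov[i][j] += d[i] * d[j]
        dof += len(vecs) - 1
    cov = [[cov[i][j] / dof for j in range(2)] for i in range(2)]
    # Invert the 2x2 covariance matrix
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    # Weights from the inverse covariance and class means
    weights = {}
    for c, m in means.items():
        w = [sum(inv[i][j] * m[i] for i in range(2)) for j in range(2)]
        w0 = -0.5 * sum(w[j] * m[j] for j in range(2))
        weights[c] = (w0, w)
    return weights

# Two well-separated made-up classes
w = train({"A": [(0, 0), (1, 0), (0, 1), (1, 1)],
           "B": [(3, 3), (4, 3), (3, 4), (4, 4)]})
score = lambda c, f: w[c][0] + w[c][1][0] * f[0] + w[c][1][1] * f[1]
print(score("A", (1, 1)) > score("B", (1, 1)))  # True: (1,1) looks like A
```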
Rejection
Rejection is the act of intentionally not classifying a gesture because it is ambiguous. This is necessary because a linear discriminator will always pick some class. Rubine notes that rejection can be dispensed with as long as a robust undo capability is provided.
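One way Rubine suggests deciding when to reject is to estimate the probability that the top-scoring class is correct from the linear evaluations, and reject when that estimate falls below a threshold (the paper uses roughly 0.95). A hedged sketch:

```python
import math

def accept(scores, threshold=0.95):
    """Estimate the probability of the top class from the linear
    evaluations v_c and reject ambiguous gestures.

    `scores` maps class name -> v_c; the threshold is tunable.
    """
    best = max(scores.values())
    p_best = 1.0 / sum(math.exp(v - best) for v in scores.values())
    return p_best >= threshold

print(accept({"line": 5.0, "circle": -3.0}))  # True: clear winner
print(accept({"line": 5.0, "circle": 4.9}))   # False: near tie, reject
```

The paper also rejects outliers whose Mahalanobis distance from the winning class mean is too large; that check is omitted here.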
The great thing about this technique is that it is very accurate for small gesture sets. The author's evaluations show accuracy better than 98% for gesture sets of fewer than 26 classes.

Discussion
As the first true gesture recognition paper I have read, I am very excited about the potential. While this paper focuses on the way something is drawn, I would anticipate numerous collisions between mathematically similar shapes in a free sketching process. For an editor, it makes sense that only a small number of actions need to be recognized, but for a free "piece of paper" the amount of training required would be mind-boggling. Still, I enjoyed this paper as a first step toward targeted recognition and hope that there are other means of generalizing.

1 comment:

andrew said...

I agree. Rubine's algorithm is definitely designed for using gestures to invoke actions rather than to create recognized visual objects on a free "piece of paper". In pen-only interfaces, these types of gestures are important, as they allow the user to interact with the system in a way that is fluid with the pen interface, as opposed to a mouse-based interface that uses menu bars, tool bars, and right-click pop-up menus.