Guiding robot planes with hand gestures

Gesture recognition is a novel approach to human-computer interaction that lets you use natural body movement to interact with computers. Because gestures are a natural, expressive form of human communication, they allow you to concentrate on the task itself, using what you already do, rather than having to learn new ways to interact. Our goal is to enable unmanned vehicles to recognize the aircraft handling gestures already made by deck crews. These gestures use both body posture and hand shape, so our system must capture both kinds of information. My research concentrates on developing a vision-based system that recognizes body and hand gestures from a continuous input stream. The system uses a single stereo camera to track body motion and hand shapes simultaneously and combines this information to recognize body-and-hand gestures. We use machine learning to train the system on many examples, allowing it to learn how to recognize each gesture.

Our system recognizes gestures in four steps. First, from the input images obtained from the stereo camera, we compute 3D depth images and remove the background. Second, the system estimates 3D body posture by fitting a skeletal body model to the input image. We extract various visual features, including a 3D point cloud, contour lines, and the history of motion, computing them both from the image and from the skeletal model; comparing the two sets of features lets the program find the most probable posture. Third, once we know the body posture, we know approximately where the hands are located. We search around each estimated wrist position, compute visual features in that region, and estimate the probability that what we see there is one of the known hand shapes used in aircraft handling: for example, open palm, closed fist, thumb up, and thumb down.
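The posture-estimation step can be sketched in code. This is a minimal illustration, not the actual system: it assumes the image features and the model-rendered features are plain numeric vectors, and it scores each candidate skeletal pose by how closely its features match the observed ones, turning the distances into probabilities. The function name and the softmax-on-distance scoring rule are illustrative assumptions.

```python
import math

def posture_probabilities(observed, candidates):
    """Score candidate skeletal postures against features from the image.

    observed: feature vector computed from the input image
    candidates: one model-rendered feature vector per candidate posture
    Returns one probability per candidate (smaller distance -> higher probability).
    """
    # Distance between observed and model features: smaller means a better fit.
    dists = [math.dist(observed, cand) for cand in candidates]
    # Convert distances into a probability distribution (softmax on -distance).
    scores = [math.exp(-d) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]

# Toy example: three candidate postures described by 4-D feature vectors.
obs = [1.0, 0.0, 0.5, 0.2]
cands = [
    [0.9, 0.1, 0.5, 0.2],   # very close to the observation
    [3.0, 2.0, 0.0, 1.0],
    [0.0, 5.0, 5.0, 5.0],
]
probs = posture_probabilities(obs, cands)
best = probs.index(max(probs))  # index of the most probable posture
```

In the real system the feature vectors would come from the 3D point cloud, contours, and motion history rather than hand-written numbers, but the matching idea is the same: the posture whose predicted appearance best explains the image wins.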
As the last step, we combine the estimated body posture and hand shape to determine the gesture. We collected twenty-four aircraft handling gestures from twenty people, giving us four hundred eighty sample gestures with which to teach the system. We use a probabilistic graphical model called a latent-dynamic conditional random field (LDCRF). This model learns the distribution of patterns within each gesture as well as the transitions between gestures. We apply it with a sliding window to recognize gestures continuously, together with a multi-layered filtering technique we developed to make the recognition more robust.

There is still a considerable amount of work to be done in gesture recognition. We continue to work on improving reliability, adapting to new gestures, and developing appropriate feedback mechanisms; for example, the system could say, “I get it” or “I don’t get it.”
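The sliding-window idea behind continuous recognition can be illustrated with a much simpler stand-in for the LDCRF and the filtering stage. In this sketch, a sequence model is assumed to have already labeled each frame, and a majority vote over a centered window suppresses isolated mislabels so the continuous output is more stable. The window size, the vote-based filter, and the gesture names are illustrative assumptions, not the system's actual multi-layered filter.

```python
from collections import Counter

def smooth_labels(frame_labels, window=5):
    """Replace each frame's label with the majority label in a centered
    sliding window, making a continuous label stream robust to noise."""
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        lo = max(0, i - half)
        hi = min(len(frame_labels), i + half + 1)
        votes = Counter(frame_labels[lo:hi])
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed

# Toy stream: one noisy frame inside a "wave_off" segment gets voted away.
stream = ["wave_off"] * 4 + ["thumb_up"] + ["wave_off"] * 4
cleaned = smooth_labels(stream)
```

A single spurious frame cannot outvote its neighbors, so brief glitches disappear while genuine gesture transitions, which persist over many frames, survive the filter.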
